Re: [ceph-users] 403-Forbidden error using radosgw
On 07/16/2014 07:58 AM, lakshmi k s wrote:
> Hello Ceph Users - My Ceph setup consists of 1 admin node, 3 OSDs, 1 radosgw, and 1 client. One of the OSD nodes also hosts the monitor. Ceph health is OK and I have verified that radosgw is running. I have created S3 and Swift users using radosgw-admin. But when I try to make any S3 or Swift calls, everything falls apart. For example -
>
> Python script -
>
>     import boto
>     import boto.s3.connection
>
>     access_key = '123'
>     secret_key = '456'

Are you sure the access and secret key are correct? See my lines a bit below.

>     conn = boto.connect_s3(
>         aws_access_key_id = access_key,
>         aws_secret_access_key = secret_key,
>         host = 'ceph-gateway.ex.com',
>         is_secure = False,
>         calling_format = boto.s3.connection.OrdinaryCallingFormat(),
>     )
>     for bucket in conn.get_all_buckets():
>         print "{name}\t{created}".format(
>             name = bucket.name,
>             created = bucket.creation_date,
>         )
>
> Client error -
>
>     Traceback (most recent call last):
>       File "dconnect.py", line 18, in <module>
>         for bucket in conn.get_all_buckets():
>       File "/usr/lib/python2.7/dist-packages/boto/s3/connection.py", line 387, in get_all_buckets
>         response.status, response.reason, body)
>     boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden AccessDenied
>
> Radosgw log
>
>     2014-07-15 22:48:15.769125 7fbb85fdb700  1 ====== starting new request req=0x7fbbe910b290 =====
>     2014-07-15 22:48:15.769443 7fbb85fdb700  2 req 17:0.000334::GET http://ceph-gateway.ex.com/::initializing
>     2014-07-15 22:48:15.769998 7fbb85fdb700 10 s->object= s->bucket=
>     2014-07-15 22:48:15.770199 7fbb85fdb700  2 req 17:0.001084:s3:GET http://ceph-gateway.ex.com/::getting op
>     2014-07-15 22:48:15.770345 7fbb85fdb700  2 req 17:0.001231:s3:GET http://ceph-gateway.ex.com/:list_buckets:authorizing
>     2014-07-15 22:48:15.770846 7fbb85fdb700 20 get_obj_state: rctx=0x7fbbc800f750 obj=.users:I420IKX56ZP09BTN4CML state=0x7fbbc8007c08 s->prefetch_data=0
>     2014-07-15 22:48:15.771314 7fbb85fdb700 10 cache get: name=.users+I420IKX56ZP09BTN4CML : hit
>     2014-07-15 22:48:15.771442 7fbb85fdb700 20 get_obj_state: s->obj_tag was set empty
>     2014-07-15 22:48:15.771537 7fbb85fdb700 10 cache get: name=.users+I420IKX56ZP09BTN4CML : hit
>     2014-07-15 22:48:15.773278 7fbb85fdb700 20 get_obj_state: rctx=0x7fbbc800f750 obj=.users.uid:lakshmi state=0x7fbbc8008208 s->prefetch_data=0
>     2014-07-15 22:48:15.773288 7fbb85fdb700 10 cache get: name=.users.uid+lakshmi : hit
>     2014-07-15 22:48:15.773293 7fbb85fdb700 20 get_obj_state: s->obj_tag was set empty
>     2014-07-15 22:48:15.773297 7fbb85fdb700 10 cache get: name=.users.uid+lakshmi : hit
>     2014-07-15 22:48:15.774247 7fbb85fdb700 10 get_canon_resource(): dest=http://ceph-gateway.ex.com/
>     2014-07-15 22:48:15.774326 7fbb85fdb700 10 auth_hdr: GET Wed, 16 Jul 2014 05:48:48 GMT http://ceph-gateway.ex.com/
>     2014-07-15 22:48:15.775425 7fbb85fdb700 15 calculated digest=k80Z0p3KlwX4TtrZa0Ws0IWCpVU=
>     2014-07-15 22:48:15.775498 7fbb85fdb700 15 auth_sign=aAd2u8uD1x/FwLAojm+vceWaITY=
>     2014-07-15 22:48:15.775536 7fbb85fdb700 15 compare=-10
>     2014-07-15 22:48:15.775603 7fbb85fdb700 10 failed to authorize request

That tells you that the gateway calculated a different signature than your client did. So something with the access and secret key is wrong.

Wido

>     2014-07-15 22:48:15.776202 7fbb85fdb700  2 req 17:0.007071:s3:GET http://ceph-gateway.ex.com/:list_buckets:http status=403
>     2014-07-15 22:48:15.776325 7fbb85fdb700  1 ====== req done req=0x7fbbe910b290 http_status=403 ======
>     2014-07-15 22:48:15.776435 7fbb85fdb700 20 process_request() returned -1
>
> Using Swift-Client -
>
>     swift --debug -V 1.0 -A http://ceph-gateway.ex.com/auth/1.0 -U ganapati:swift -K "GIn60fmdvnEh5tSiRziixcO5wVxZjg9eoYmtX3hJ" list
>     INFO:urllib3.connectionpool:Starting new HTTP connection (1): ceph-gateway.ex.com
>     DEBUG:urllib3.connectionpool:Setting read timeout to
>     DEBUG:urllib3.connectionpool:"GET /auth/1.0 HTTP/1.1" 403 23
>     ('lks: response %s', )
>     INFO:swiftclient:REQ: curl -i http://ceph-gateway.ex.com/auth/1.0 -X GET
>     INFO:swiftclient:RESP STATUS: 403 Forbidden
>     INFO:swiftclient:RESP HEADERS: [('date', 'Wed, 16 Jul 2014 05:45:22 GMT'), ('accept-ranges', 'bytes'), ('content-type', 'application/json'), ('content-length', '23'), ('server', 'Apache/2.4.7 (Ubuntu)')]
>     INFO:swiftclient:RESP BODY: {"Code":"AccessDenied"}
>     ERROR:swiftclient:Auth GET failed: http://ceph-gateway.ex.com/auth/1.0 403 Forbidden
>     Traceback (most recent call last):
>       File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 1187, in _retry
>         self.url, self.token = self.get_auth()
>       File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 1161, in get_auth
>         insecure=self.insecure)
>       File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 324, in get_auth
>         insecure=insecure)
>       File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 247, in get_auth_1_0
>         http_reason=resp.reason)
>     ClientException: Auth GET failed: http://ceph-gateway.ex.com/auth/1
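For the archive: the "calculated digest" vs "auth_sign" mismatch in the log is the whole story. The S3 v2 signature the gateway checks is just a base64 HMAC-SHA1 over the canonical request string (the "auth_hdr" the log prints), so you can reproduce it in a few lines to sanity-check a key pair. This is a generic sketch, not radosgw code; the date and resource values are taken from the log above, and this simple form assumes no Content-MD5, Content-Type, or x-amz- headers:

```python
import base64
import hashlib
import hmac

def sign_s3_v2(secret_key, method, date, resource):
    # Canonical string for a simple S3 v2 request:
    # METHOD \n Content-MD5 \n Content-Type \n Date \n Resource
    string_to_sign = "\n".join([method, "", "", date, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")

# Values taken from the auth_hdr line in the radosgw log above
sig = sign_s3_v2("456", "GET", "Wed, 16 Jul 2014 05:48:48 GMT", "/")
print(sig)
```

If the value computed from your secret key does not match what the client sent in the Authorization header, the key material is wrong, which matches Wido's diagnosis.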
Re: [ceph-users] basic questions about pool
thank you very much, Karan, for your explanation.

Regards
Pragya Jain

On Tuesday, 15 July 2014 1:53 PM, Karan Singh wrote:
> Hi Pragya
>
> Let me try to answer these.
>
> 1# The decision is based on your use case (performance, reliability). If you need high performance out of your cluster, the deployer will create a pool on SSDs and assign this pool to applications which require higher I/O. For example, if you integrate OpenStack with Ceph, you can instruct the OpenStack configuration files to write data to a specific Ceph pool (http://ceph.com/docs/master/rbd/rbd-openstack/#configuring-glance); similarly you can instruct CephFS and RadosGW which pool to use for data storage.
>
> 2# Usually the end user (client to the Ceph cluster) does not bother about where the data is getting stored, which pool it is using, and what the real physical location of the data is. The end user will ask for specific performance, reliability and availability. It is the job of the Ceph admin to fulfil their storage requirements using Ceph features such as SSDs, erasure coding, replication level etc.
>
> Block Device: the end user will instruct the application (Qemu/KVM, OpenStack etc.) which pool it should use for data storage. rbd is the default pool for block devices.
> CephFS: the end user will mount a pool as a filesystem and can use it further. The default pools are data and metadata.
> RadosGW: the end user will store objects using the S3 or Swift API.
>
> - Karan Singh -
>
> On 15 Jul 2014, at 07:42, pragya jain wrote:
>> thank you very much, Craig, for your clear explanation against my questions.
>>
>> Now I am very clear about the concept of pools in ceph. But I have two small questions:
>> 1. How does the deployer decide that a particular type of information will be stored in a particular pool? Are there any settings at the time of creation of a pool that a deployer should make to ensure which type of data will be stored in which pool?
>> 2. How does an end-user specify in which pool his/her data will be stored? How can an end-user come to know which pools are stored on SSDs or on HDDs, and what the properties of a particular pool are?
>>
>> Thanks again, please help to clear these confusions also.
>>
>> Regards
>> Pragya Jain
>>
>> On Sunday, 13 July 2014 5:04 AM, Craig Lewis wrote:
>>> I'll answer out of order.
>>>
>>> #2: rbd is used for RBD images. data and metadata are used by CephFS. RadosGW's default pools will be created the first time radosgw starts up. If you aren't using RBD or CephFS, you can ignore those pools.
>>>
>>> #1: RadosGW will use several pools to segregate its data. There are a couple of pools for storing user/subuser information, as well as pools for storing the actual data. I'm using federation, and I have a total of 18 pools that RadosGW is using in some form. Pools are a way to logically separate your data, and pools can also have different replication/storage settings. For example, I could say that the .rgw.buckets.index pool needs 4x replication and is only stored on SSDs, while .rgw.buckets is 3x replication on HDDs.
>>>
>>> #3: In addition to #1, you can set up different pools to actually store user data in RadosGW. For example, an end user may have some very important data that you want replicated 4 times, and some other data that needs to be stored on SSDs for low latency. Using CRUSH, you would create some rados pools with those specs. Then you'd set up some placement targets in RadosGW that use those pools. A user that cares will specify a placement target when they create a bucket. That way they can decide what the storage requirements are. If they don't care, then they can just use the default.
>>>
>>> Does that help?
>>>
>>> On Thu, Jul 10, 2014 at 11:34 PM, pragya jain wrote:
>>>> hi all, I have some very basic questions about pools in ceph. According to the ceph documentation, as we deploy a ceph cluster with a radosgw instance over it, ceph creates pools by default to store the data, or the deployer can also create pools according to the requirements. Now, my questions are:
>>>> 1. What is the relevance of multiple pools in a cluster? i.e. why should a deployer create multiple pools in a cluster? What are the benefits of creating multiple pools?
>>>> 2. According to the docs, the default pools are data, metadata, and rbd. What is the difference among these three types of pools?
>>>> 3. When a system deployer has deployed a ceph cluster with a radosgw interface and starts providing services to end-users, such that end-users can create their accounts on the ceph cluster and can store/retrieve their data to/from the cluster, then does the end user have any concern about the pools created in the
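To make Craig's #1/#3 concrete, here is a rough sketch of how a deployer might create pools with different replication levels and CRUSH placement. Pool names are made up, and it assumes a CRUSH rule for SSD-backed hosts (rule id 2 here) has already been defined in the crush map:

```
# a 3x-replicated pool on the default (HDD) rule, 128 placement groups
ceph osd pool create archive-hdd 128 128
ceph osd pool set archive-hdd size 3

# a 4x-replicated pool mapped to an SSD-only CRUSH rule
ceph osd pool create fast-ssd 128 128
ceph osd pool set fast-ssd size 4
ceph osd pool set fast-ssd crush_ruleset 2   # id of the SSD rule
```

RadosGW placement targets would then reference pools like these in the region/zone configuration, and a bucket created with that placement target lands on the matching hardware.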
Re: [ceph-users] scrub error on firefly
Hi Randy,

This is the same kernel we reproduced the issue on as well. Sam traced this down to the XFS allocation hint ioctl we recently started using for RBD. We've just pushed out a v0.80.4 firefly release that disables the hint by default. It should stop the inconsistencies from popping up, although you will need to use ceph pg repair to fix the existing inconsistencies.

sage

On Mon, 14 Jul 2014, Randy Smith wrote:
> $ lsb_release -a
> LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch
> Distributor ID: Ubuntu
> Description: Ubuntu 12.04.4 LTS
> Release: 12.04
> Codename: precise
>
> $ uname -a
> Linux droopy 3.2.0-64-generic #97-Ubuntu SMP Wed Jun 4 22:04:21 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>
> On Sat, Jul 12, 2014 at 3:21 PM, Samuel Just wrote:
> > Also, what distribution and kernel version are you using?
> > -Sam
> >
> > On Jul 12, 2014 10:46 AM, "Samuel Just" wrote:
> > > When you see another one, can you include the xattrs on the files as well (you can use the attr(1) utility)?
> > > -Sam
> > >
> > > On Sat, Jul 12, 2014 at 9:51 AM, Randy Smith wrote:
> > > > That image is the root file system for a linux ldap server.
> > > >
> > > > -- Randall Smith, Adams State University, www.adams.edu, 719-587-7741
> > > >
> > > > On Jul 12, 2014 10:34 AM, "Samuel Just" wrote:
> > > > > Here's a diff of the two files. One of the two files appears to contain ceph leveldb keys? Randy, do you have an idea of what this rbd image is being used for (rb.0.b0ce3.238e1f29, that is).
> > > > > -Sam
> > > > >
> > > > > On Fri, Jul 11, 2014 at 7:25 PM, Randy Smith wrote:
> > > > > > Greetings,
> > > > > >
> > > > > > Well it happened again with two pgs this time, still in the same rbd image. They are at http://people.adams.edu/~rbsmith/osd.tar. I think I grabbed the files correctly. If not, let me know and I'll try again on the next failure. It certainly is happening often enough.
> > > > > >
> > > > > > On Fri, Jul 11, 2014 at 3:39 PM, Samuel Just wrote:
> > > > > > > And grab the xattrs as well.
> > > > > > > -Sam
> > > > > > >
> > > > > > > On Fri, Jul 11, 2014 at 2:39 PM, Samuel Just wrote:
> > > > > > > > Right.
> > > > > > > > -Sam
> > > > > > > >
> > > > > > > > On Fri, Jul 11, 2014 at 2:05 PM, Randy Smith wrote:
> > > > > > > > > Greetings,
> > > > > > > > >
> > > > > > > > > I'm using xfs.
> > > > > > > > >
> > > > > > > > > Also, when, in a previous email, you asked if I could send the object, do you mean the files from each server named something like this:
> > > > > > > > > ./3.c6_head/DIR_6/DIR_C/DIR_5/rb.0.b0ce3.238e1f29.000b__head_34DC35C6__3
> > > > > > > > > ?
> > > > > > > > >
> > > > > > > > > On Fri, Jul 11, 2014 at 2:00 PM, Samuel Just wrote:
> > > > > > > > > > Also, what filesystem are you using?
> > > > > > > > > > -Sam
> > > > > > > > > >
> > > > > > > > > > On Fri, Jul 11, 2014 at 10:37 AM, Sage Weil wrote:
> > > > > > > > > > > One other thing we might also try is catching this earlier (on first read of corrupt data) instead of waiting for scrub. If you are not super performance sensitive, you can add
> > > > > > > > > > >
> > > > > > > > > > >     filestore sloppy crc = true
> > > > > > > > > > >     filestore sloppy crc block size = 524288
> > > > > > > > > > >
> > > > > > > > > > > That will track and verify CRCs on any large (>512k) writes. Smaller block sizes will give more precision and more checks, but will generate
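For anyone hitting this later, the repair workflow Sage refers to is the standard one. These are stock ceph CLI commands; the PG id here is just an example:

```
# list the placement groups that deep-scrub flagged as inconsistent
ceph health detail | grep inconsistent

# ask the primary OSD to re-scrub and repair a specific PG, e.g. 3.c6
ceph pg repair 3.c6
```

Note that repair trusts the primary's copy, so with real corruption (as in this thread) it is worth confirming which replica is the good one before repairing.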
[ceph-users] v0.80.4 Firefly released
This Firefly point release fixes a potential data corruption problem when ceph-osd daemons run on top of XFS and service Firefly librbd clients. A recently added allocation hint that RBD utilizes triggers an XFS bug on some kernels (Linux 3.2, and likely others) that leads to data corruption and deep-scrub errors (and inconsistent PGs). This release avoids the situation by disabling the allocation hint until we can validate which kernels are affected and/or known to be safe to use the hint on.

We recommend that all v0.80.x Firefly users upgrade urgently, especially if they are using RBD.

Notable Changes
---
* osd: disable XFS extsize hint by default (#8830, Samuel Just)
* rgw: fix extra data pool default name (Yehuda Sadeh)

For more detailed information, see: http://ceph.com/docs/master/_downloads/v0.80.4.txt

Getting Ceph
* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.80.4.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
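For package-based installs, the upgrade itself is the usual package refresh plus daemon restarts. A rough sketch for Ubuntu with upstart (assumes the ceph.com apt repository is already configured; the usual convention is monitors first, then OSDs, one node at a time):

```
apt-get update && apt-get install -y ceph ceph-common
restart ceph-mon-all    # on monitor nodes first
restart ceph-osd-all    # then on each OSD node in turn
```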
Re: [ceph-users] ceph-fuse couldn't be connect.
On Tue, Jul 15, 2014 at 10:15 AM, Jaemyoun Lee wrote:
> The output is nothing because ceph-fuse fell into an infinite while loop as I explained below.
>
> Where can I find the log file of ceph-fuse?

It defaults to /var/log/ceph, but it may be empty. I realize the task may have hung, but I'm pretty sure it isn't looping, just waiting on some kind of IO. You could try running it with the "--debug-ms 1 --debug-client 10" command-line options appended and see what it spits out.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
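Spelled out against the mount command from the original report, Greg's suggestion looks like this (the -f flag keeps ceph-fuse in the foreground so the debug output stays visible on the terminal):

```
ceph-fuse -m 192.168.122.106:6789 /mnt -f --debug-ms 1 --debug-client 10
```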
Re: [ceph-users] the differences between snap and clone in terms of implement
Okay, first the basics: cls_rbd.cc operates only on rbd header objects, so it's doing coordinating activities, not the actual data handling. When somebody does an operation on an rbd image, they put some data in the header object so that everybody else can coordinate (if it's open) or continue (if they're opening it later).

A "snap" is a snapshot of the RBD image, which does exactly what snapshots always do (from a user perspective): you can use the snapshot name/ID to refer to the exact data that's on-disk now. Writes that happen after a snapshot are COWed, so you only store deltas, not a full copy of the image. Snapshots in RADOS are not themselves writeable.

A "clone" is designed a bit differently: it lets you create a new, *writeable* image which maps back to the base image for any unwritten extents. Unlike a snapshot, you can write to the clone and it diverges from any other clones or snapshots which are taken on the base image.

So yes, neither one does any real data write in cls_rbd, because they're just updating pointers used by the clients so that the data is arranged properly when the clients access the rbd image. However, a snapshot is a property of the image, and its data lives next to the rest of the image's data; clones are new images which live somewhere else.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, Jul 15, 2014 at 9:14 AM, fastsync wrote:
> hi, all
> I took a glance at the ceph code in cls_rbd.cc. It seems that snap and clone both do not R/W any data; they just add some keys and values, even for rbds in different pools. Am I missing something? Or could you explain more deeply the implementation of snap and clone?
>
> thanks very much.
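From the user side, the distinction Greg describes shows up in the standard rbd workflow. Image and snapshot names below are made up:

```
rbd snap create rbd/base@snap1      # read-only, COW snapshot of "base"
rbd snap protect rbd/base@snap1     # cloning requires a protected snapshot
rbd clone rbd/base@snap1 rbd/child  # new writeable image backed by snap1
rbd flatten rbd/child               # optionally copy all data, severing the parent link
```

The snapshot's data stays with the base image; the clone is a separate image (possibly in a different pool) whose header records the parent pointer that cls_rbd maintains.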
Re: [ceph-users] ceph-fuse couldn't be connect.
The output is nothing because ceph-fuse fell into an infinite while loop as I explained below.

Where can I find the log file of ceph-fuse?

Jae.

On Jul 16, 2014 1:59 AM, "Gregory Farnum" wrote:
> What did ceph-fuse output to its log file or the command line?
>
> On Tuesday, July 15, 2014, Jaemyoun Lee wrote:
>> Hi All,
>>
>> I am using ceph 0.80.1 on Ubuntu 14.04 on KVM. However, I cannot connect to the MON from a client using ceph-fuse.
>>
>> On the client, I installed ceph-fuse 0.80.1 and added the fuse module. But I think something is wrong. The result is
>>
>> # modprobe fuse
>> (no output)
>> # lsmod | grep fuse
>> (no output)
>> # ceph-fuse -m 192.168.122.106:6789 /mnt
>> ceph-fuse[1905]: starting ceph client
>> (at this point, ceph-fuse fell into an infinite while loop)
>> ^C
>> #
>>
>> What problem is it?
>>
>> My cluster looks like the following:
>>
>> Host OS (Ubuntu 14.04)
>> --- VM-1 (Ubuntu 14.04): MON-0, MDS-0
>> --- VM-2 (Ubuntu 14.04): OSD-0
>> --- VM-3 (Ubuntu 14.04): OSD-1, OSD-2, OSD-3
>> --- VM-4 (Ubuntu 14.04): the client
>>
>> The result of "ceph -s" on VM-1, which is the MON, is
>>
>> # ceph -s
>>     cluster 1ae5585d-03c6-4a57-ba79-c65f4ed9e69f
>>     health HEALTH_OK
>>     monmap e1: 1 mons at {csA=192.168.122.106:6789/0}, election epoch 1, quorum 0 csA
>>     osdmap e37: 4 osds: 4 up, 4 in
>>     pgmap v678: 192 pgs, 3 pools, 0 bytes data, 0 objects
>>     20623 MB used, 352 GB / 372 GB avail
>>     192 active+clean
>> #
>>
>> Regards, Jae
>>
>> -- 이재면 Jaemyoun Lee
>> E-mail : jaemy...@gmail.com
>> Homepage : http://jaemyoun.com
>> Facebook : https://www.facebook.com/jaemyoun
>
> -- Software Engineer #42 @ http://inktank.com | http://ceph.com
Re: [ceph-users] 0.80.1 to 0.80.3: strange osd log messages
Dzianis Kahanovich writes:
> Dzianis Kahanovich writes:
>> After upgrading 0.80.1 to 0.80.3 I see many regular messages in every OSD log:
>>
>> 2014-07-15 19:44:48.292839 7fa5a659f700 0 osd.5 62377 crush map has features 2199057072128, adjusting msgr requires for mons
>>
>> (constant part: "crush map has features 2199057072128, adjusting msgr requires for mons")
>> HEALTH_OK, tunables optimal. What is it?
>
> Suddenly it stopped. I reloaded the crush map (via crushtool & text) and waited about 5 minutes after HEALTH_OK. Previously I saw this in a git emperor build for about a day and fell back to 0.80.1. But it would be good to know more.

Oh, no. It started again after 8 minutes.

--
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/
Re: [ceph-users] 0.80.1 to 0.80.3: strange osd log messages
Dzianis Kahanovich writes:
> After upgrading 0.80.1 to 0.80.3 I see many regular messages in every OSD log:
>
> 2014-07-15 19:44:48.292839 7fa5a659f700 0 osd.5 62377 crush map has features 2199057072128, adjusting msgr requires for mons
>
> (constant part: "crush map has features 2199057072128, adjusting msgr requires for mons")
> HEALTH_OK, tunables optimal. What is it?

Suddenly it stopped. I reloaded the crush map (via crushtool & text) and waited about 5 minutes after HEALTH_OK. Previously I saw this in a git emperor build for about a day and fell back to 0.80.1. But it would be good to know more.

--
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/
Re: [ceph-users] ceph-fuse couldn't be connect.
What did ceph-fuse output to its log file or the command line? On Tuesday, July 15, 2014, Jaemyoun Lee wrote: > Hi All, > > I am using ceph 0.80.1 on Ubuntu 14.04 on KVM. However, I cannot connect > to the MON from a client using ceph-fuse. > > On the client, I installed the ceph-fuse 0.80.1 and added fuse. But, I > think it is wrong. The result is > > # modprobe fuse > (Any output was nothing) > # lsmod | grep fuse > (Any output was nothing) > # ceph-fuse -m 192.168.122.106:6789 /mnt > ceph-fuse[1905]: starting ceph client > (at this point, ceph-fuse fell into an infinite while loop) > ^C > # > > What problem is it? > > My cluster like the follow: > > Host OS (Ubuntu 14.04) > --- VM-1 (Ubuntu 14.04) > -- MON-0 > -- MDS-0 > --- VM-2 (Ubuntu 14.04) > -- OSD-0 > --- VM-3 (Ubuntu 14.04) > -- OSD-1 > -- OSD-2 > -- OSD-3 > --- VM-4 (Ubuntu 14.04) > -- it's for client. > > the result of "# ceph -s" on VM-1, which is MON, is > > # ceph -s > cluster 1ae5585d-03c6-4a57-ba79-c65f4ed9e69f > > health HEALTH_OK > > monmap e1: 1 mons at {csA=192.168.122.106:6789/0}, election epoch 1, > quorum 0 csA > > osdmap e37: 4 osds: 4 up, 4 in > > pgmap v678: 192 pgs, 3 pools, 0 bytes data, 0 objects > > 20623 MB used, 352 GB / 372 GB avail > > 192 active+clean > > # > > Regards, > Jae > > -- > 이재면 Jaemyoun Lee > > E-mail : jaemy...@gmail.com > > Homepage : http://jaemyoun.com > Facebook : https://www.facebook.com/jaemyoun > -- Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] 0.80.1 to 0.80.3: strange osd log messages
After upgrading 0.80.1 to 0.80.3 I see many regular messages on every OSD log: 2014-07-15 19:44:48.292839 7fa5a659f700 0 osd.5 62377 crush map has features 2199057072128, adjusting msgr requires for mons (constant part: "crush map has features 2199057072128, adjusting msgr requires for mons") HEALTH_OK, tunables optimal. What is it? -- WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] [ANN] ceph-deploy 1.5.9 released
Hi All,

There is a new release of ceph-deploy, the easy deployment tool for Ceph.

This release cleans up how ceph-deploy disconnects from remote hosts, which was previously producing some tracebacks, and adds a new flag to the `new` subcommand that allows specifying an fsid for the cluster.

The full list of fixes for this release can be found in the changelog:
http://ceph.com/ceph-deploy/docs/changelog.html#id1

Make sure you update!

Alfredo
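For example, pinning the cluster fsid at creation time might look like the following. The hostname is made up, and the exact flag spelling is an assumption based on the announcement (check `ceph-deploy new --help` on 1.5.9):

```
ceph-deploy new --fsid $(uuidgen) mon1
```

This writes the given fsid into the generated ceph.conf instead of a random one, which is handy for rebuilding a cluster with a known identity.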
Re: [ceph-users] create a image that stores information in erasure-pool failed
You can't use erasure coded pools directly with RBD. They're only suitable for use with RGW or as the base pool for a replicated cache pool, and you need to be very careful/specific with the configuration. I believe this is well-documented, so check it out! :)
-Greg

On Saturday, July 12, 2014, qixiaof...@chinacloud.com.cn wrote:
> hi, all:
> I created an erasure-coded pool, and then tried to create a 1GB image named foo that stores its data in the erasure-coded pool; however the operation failed, with output as follows:
>
> root@mon1:~# rbd create foo --size 1024 --pool ecpool
> rbd: create error: (95) Operation not supported
> 2014-07-13 10:32:55.311330 7f1b6563f780 -1 librbd: error adding image to directory: (95) Operation not supported
>
> When I created another image in a normal pool, the operation succeeded, so I wonder whether it's my mistake or a problem with the erasure-coded pool. Could you help me solve this puzzling problem? Thank you very much!
>
> My cluster is deployed as follows:
> one monitor
> six osds
> EC pool: ecpool with 100 pgs, profile: jerasure, k=4, m=2, reed_sol_van
>
> yours sincerely,
> ifstillfly
> -- qixiaof...@chinacloud.com.cn

-- Software Engineer #42 @ http://inktank.com | http://ceph.com
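A rough sketch of the cache-tier arrangement Greg describes, reusing the poster's ecpool. The cache pool name and PG counts are made up, and as Greg warns, read the cache tiering documentation carefully before putting real data on this:

```
# replicated pool that will sit in front of the EC pool
ceph osd pool create cachepool 128

# wire it up as a writeback cache tier over the EC base pool
ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay ecpool cachepool

# RBD I/O against ecpool now goes through the replicated cache tier
rbd create foo --size 1024 --pool ecpool
```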
Re: [ceph-users] RGW: Get object ops performance problem
Are you saturating your network bandwidth? That's what it sounds like. :)
-Greg

On Monday, July 14, 2014, baijia...@126.com wrote:
> hi, everyone!
>
> I am testing RGW get-object ops. When I use 100 threads to get one and the same object, performance is very good: the mean response time is 0.1s. But when I use 150 threads to get one and the same object, performance is very bad: the mean response time is 1s.
>
> I observed the osd log and rgw log:
>
> rgw log:
> 2014-07-15 10:36:42.999719 7f45596fb700 1 -- 10.0.1.61:0/1022376 --> 10.0.0.21:6835/24201 -- osd_op(client.6167.0:22721 default.5632.8_ws1411.jpg [getxattrs,stat,read 0~524288] 4.5210f70b ack+read e657)
> 2014-07-15 10:36:44.064720 7f467efdd700 1 -- 10.0.1.61:0/1022376 <== osd.7 10.0.0.21:6835/24201 22210 osd_op_reply(22721
>
> osd log:
> 10:36:43.001895 7f6cdb24c700 1 -- 10.0.0.21:6835/24201 <== client.6167 10.0.1.61:0/1022376 22436 osd_op(client.6167.0:22721 default.5632.8_ws1411.jpg
> 2014-07-15 10:36:43.031762 7f6cbf01f700 1 -- 10.0.0.21:6835/24201 --> 10.0.1.61:0/1022376 -- osd_op_reply(22721 default.5632.8_ws1411.jpg
>
> So I think the problem does not happen in the osd. Why did the osd send the op reply at 10:36:43.031762, while rgw received it at 10:36:44.064720?

-- baijia...@126.com
-- Software Engineer #42 @ http://inktank.com | http://ceph.com
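Greg's bandwidth hypothesis is easy to sanity-check with Little's law. This is only a back-of-envelope sketch: it assumes every GET transfers the full 512 KB read visible in the rgw log, which may not hold if responses are cached or range-limited:

```python
# Offered load implied by concurrency and mean response time
# (Little's law: request rate = concurrency / latency).
OBJECT_BYTES = 524288  # 512 KB, from "read 0~524288" in the rgw log

def offered_load_gbit(threads, mean_response_s):
    reqs_per_s = threads / mean_response_s
    return reqs_per_s * OBJECT_BYTES * 8 / 1e9

print(offered_load_gbit(100, 0.1))  # ~4.19 Gbit/s
print(offered_load_gbit(150, 1.0))  # ~0.63 Gbit/s
```

On a 1 GbE link, ~0.6 Gbit/s plus protocol overhead is already near the wire, which would fit the sudden latency jump at 150 threads; and the 100-thread figure only works out if most responses avoid the network path (cache) or the link is faster than gigabit.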
Re: [ceph-users] how to plan the ceph storage architecture when i reuse old PC Server
It's generally recommended that you use disks in JBOD mode rather than involving RAID.
-Greg

On Monday, July 14, 2014, 不坏阿峰 wrote:
> I have installed and tested Ceph on VMs before, so I know a bit about configuration and installation. Now I want to install Ceph on physical PC servers and do some tests; I think the performance will be better than on VMs. I have some questions about how to plan the Ceph storage architecture.
>
> What do I have:
> 1. only one Intel SSD 520 120G, planned for the Ceph journal, to speed up Ceph performance
> 2. some old PC servers with an array controller, each with 4 x 300G SAS HDDs
> 3. a Gigabit switch
>
> What do I plan:
> 1. a Ceph cluster such as: A: osd+mon, B: osd+mon - a two-server Ceph cluster
> 2. the only SSD installed in A, used for the Ceph journal
> 3. NIC bonding (LACP) on the servers, with the switch ports configured for LACP
> 4. I plan to use Debian 7 testing
>
> What I want to ask:
> 1. shall I configure RAID on the array controller or not? Or how can I use the HDDs better - use the HDDs directly, with no RAID?
> 2. I have only one SSD for the journal, so what should I change in my plan? And how do I configure the journal to be on the SSD?
>
> Hope someone can give me some guidance. Many thanks in advance.
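On the journal question: one common way is to point each OSD's journal at a dedicated SSD partition via ceph.conf. This is only a sketch; the device paths below are made-up examples, and one 120G SSD shared by many OSDs becomes both a bottleneck and a single point of failure for those OSDs:

```ini
[osd]
; journal size in MB (used when the journal is a file)
osd journal size = 10240

[osd.0]
; point this OSD's journal at a dedicated SSD partition instead
osd journal = /dev/disk/by-partlabel/osd0-journal
```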
[ceph-users] the differences between snap and clone in terms of implement
hi, all
I took a glance at the ceph code in cls_rbd.cc. It seems that snap and clone both do not R/W any data; they just add some keys and values, even for rbds in different pools. Am I missing something? Or could you explain more deeply the implementation of snap and clone?

thanks very much.
Re: [ceph-users] Working ISCSI target guide
hi, there may be two ways, but note that cephfs is not production-ready:

1. you can use a file stored in cephfs as a target.
2. there is rbd.ko, which maps an rbd image as a block device, which you can then assign to a target.

I have not tested this yet. Good luck.

At 2014-07-15 09:18:53, "Drew Weaver" wrote:
> One other question: if you are going to be using Ceph as a storage system for KVM virtual machines, does it even matter if you use iSCSI or not? Meaning that if you are just going to use LVM and have several hypervisors sharing that same VG, then using iSCSI isn't really a requirement unless you are using a hypervisor like ESXi which only works with iSCSI/NFS, correct?
>
> Thanks, -Drew
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Drew Weaver
> Sent: Tuesday, July 15, 2014 9:03 AM
> To: 'ceph-users@lists.ceph.com'
> Subject: [ceph-users] Working ISCSI target guide
>
> Does anyone have a guide or reproducible method of getting multipath iSCSI working in front of Ceph? Even if it just means having two front-end iSCSI targets, each with access to the same underlying Ceph volume? This seems like a super popular topic.
>
> Thanks, -Drew
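Option 2 might look roughly like the following (a sketch, not a tested recipe: the target IQN and image name are made up, and it assumes the kernel rbd client and tgt are installed on the gateway host):

```
modprobe rbd
rbd map rbd/iscsi-vol    # exposes the image as e.g. /dev/rbd0

# export the mapped device as an iSCSI LUN via tgt
tgtadm --lld iscsi --op new --mode target --tid 1 \
       -T iqn.2014-07.com.example:rbd-iscsi
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b /dev/rbd0
tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL
```

For multipath with two gateway heads, both would map the same image, which requires care (no write caching on the targets) to avoid corrupting the shared volume.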
Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time
On Tue, 15 Jul 2014, Andrija Panic wrote:
> Hi Sage, since this problem is tunables-related, do we need to expect the
> same behavior or not when we do regular data rebalancing caused by
> adding/removing OSDs? I guess not, but would like your confirmation.
> I'm already on optimal tunables, but I'm afraid to test this by e.g.
> shutting down 1 OSD.

When you shut down a single OSD it is a relatively small amount of data that needs to move to do the recovery. The issue with the tunables is just that a huge fraction of the data stored needs to move, and the performance impact is much higher.

sage

> Thanks,
> Andrija
>
> On 14 July 2014 18:18, Sage Weil wrote:
> > I've added some additional notes/warnings to the upgrade and release notes:
> > https://github.com/ceph/ceph/commit/fc597e5e3473d7db6548405ce347ca7732832451
> >
> > If there is somewhere else where you think a warning flag would be useful, let me know!
> >
> > Generally speaking, we want to be able to cope with huge data rebalances without interrupting service. It's an ongoing process of improving the recovery vs client prioritization, though, and removing sources of overhead related to rebalancing... and it's clearly not perfect yet. :/
> >
> > sage
> >
> > On Sun, 13 Jul 2014, Andrija Panic wrote:
> > > Hi,
> > > after the ceph upgrade (0.72.2 to 0.80.3) I issued "ceph osd crush tunables optimal" and after only a few minutes I added 2 more OSDs to the CEPH cluster...
> > >
> > > So these 2 changes were more or less done at the same time - rebalancing because of tunables optimal, and rebalancing because of adding new OSDs...
> > >
> > > Result - all VMs living on CEPH storage went mad, effectively no disk access, blocked so to speak.
> > >
> > > Since this rebalancing took 5h-6h, I had a bunch of VMs down for that long...
> > >
> > > Did I do wrong by causing "2 rebalancings" to happen at the same time?
> > > Is this behaviour normal, to cause great load on all VMs because they are unable to access CEPH storage effectively?
> > >
> > > Thanks for any input...
> > > -- Andrija Panić

--
Andrija Panić
--
http://admintweets.com
[ceph-users] ceph-fuse couldn't connect.
Hi All, I am using ceph 0.80.1 on Ubuntu 14.04 on KVM. However, I cannot connect to the MON from a client using ceph-fuse. On the client, I installed ceph-fuse 0.80.1 and loaded the fuse module, but I think something is wrong. The result is:

# modprobe fuse
(no output)
# lsmod | grep fuse
(no output)
# ceph-fuse -m 192.168.122.106:6789 /mnt
ceph-fuse[1905]: starting ceph client
(at this point, ceph-fuse fell into an infinite loop)
^C
#

What is the problem? My cluster looks like the following:

Host OS (Ubuntu 14.04)
--- VM-1 (Ubuntu 14.04)
    -- MON-0
    -- MDS-0
--- VM-2 (Ubuntu 14.04)
    -- OSD-0
--- VM-3 (Ubuntu 14.04)
    -- OSD-1
    -- OSD-2
    -- OSD-3
--- VM-4 (Ubuntu 14.04) -- it's the client.

The result of "# ceph -s" on VM-1, which is the MON, is:

# ceph -s
    cluster 1ae5585d-03c6-4a57-ba79-c65f4ed9e69f
     health HEALTH_OK
     monmap e1: 1 mons at {csA=192.168.122.106:6789/0}, election epoch 1, quorum 0 csA
     osdmap e37: 4 osds: 4 up, 4 in
      pgmap v678: 192 pgs, 3 pools, 0 bytes data, 0 objects
            20623 MB used, 352 GB / 372 GB avail
                 192 active+clean
#

Regards,
Jae
--
이재면 Jaemyoun Lee
E-mail : jaemy...@gmail.com
Homepage : http://jaemyoun.com
Facebook : https://www.facebook.com/jaemyoun
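In case it helps others hitting the same hang: ceph-fuse typically blocks at "starting ceph client" when it cannot authenticate or reach the monitor, rather than failing outright. A few client-side checks (a sketch only; the keyring path assumes the default layout, and the monitor address is taken from the post above):

```shell
# 1. Can the client reach the monitor at all?
ceph -m 192.168.122.106:6789 -s

# 2. Is the admin keyring present on the client? ceph-fuse hangs silently
#    when cephx authentication cannot complete.
ls -l /etc/ceph/ceph.client.admin.keyring

# 3. CephFS also needs an active MDS; check its state from a cluster node.
ceph mds stat

# 4. Re-run ceph-fuse in the foreground with debugging to see where it blocks.
ceph-fuse -d -m 192.168.122.106:6789 /mnt
```

If step 1 hangs as well, the problem is network/firewall or the keyring, not fuse itself.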
Re: [ceph-users] Working ISCSI target guide
Drew, I would not use iSCSI with KVM; instead, I would use the built-in RBD support. However, you would use something like NFS/iSCSI if you were to connect other hypervisors to a Ceph backend. Having failover capabilities is important here ))

Andrei

--
Andrei Mikhailovsky
Director
Arhont Information Security
Web: http://www.arhont.com http://www.wi-foo.com
Tel: +44 (0)870 4431337
Fax: +44 (0)208 429 3111
PGP: Key ID - 0x2B3438DE
PGP: Server - keyserver.pgp.com

DISCLAIMER The information contained in this email is intended only for the use of the person(s) to whom it is addressed and may be confidential or contain legally privileged information. If you are not the intended recipient you are hereby notified that any perusal, use, distribution, copying or disclosure is strictly prohibited. If you have received this email in error please immediately advise us by return email at and...@arhont.com and delete and purge the email and any attachments without making a copy.

- Original Message -
From: "Drew Weaver"
To: "ceph-users@lists.ceph.com"
Sent: Tuesday, 15 July, 2014 2:18:53 PM
Subject: Re: [ceph-users] Working ISCSI target guide

One other question, if you are going to be using Ceph as a storage system for KVM virtual machines does it even matter if you use iSCSI or not? Meaning that if you are just going to use LVM and have several hypervisors sharing that same VG, then using iSCSI isn't really a requirement unless you are using a hypervisor like ESXi which only works with iSCSI/NFS, correct?

Thanks,
-Drew

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Drew Weaver
Sent: Tuesday, July 15, 2014 9:03 AM
To: 'ceph-users@lists.ceph.com'
Subject: [ceph-users] Working ISCSI target guide

Does anyone have a guide or reproducible method of getting multipath iSCSI working in front of Ceph? Even if it just means having two front-end iSCSI targets each with access to the same underlying Ceph volume? This seems like a super popular topic.
Thanks,
-Drew
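Since the thread keeps circling back to this point: when the hypervisor is KVM/QEMU, the iSCSI layer can be skipped entirely because QEMU links against librbd. A minimal sketch of the native path (pool and image names are examples, not from the thread):

```shell
# Create an image and hand it to QEMU directly via librbd - no iSCSI target
# and no shared LVM VG. Every hypervisor talks to the same Ceph cluster,
# so live migration works without any extra shared-storage plumbing.
qemu-img create -f rbd rbd:vms/disk01 10G
qemu-system-x86_64 -m 2048 -drive format=rbd,file=rbd:vms/disk01
```

iSCSI (or NFS) only becomes necessary for hypervisors without librbd support, such as ESXi, which is exactly the distinction Drew asks about above.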
[ceph-users] create a image that stores information in erasure-pool failed
hi, all:

I created an erasure-coded pool, then tried to create a 1GB image named foo in that pool. However, the action failed with the following error:

root@mon1:~# rbd create foo --size 1024 --pool ecpool
rbd: create error: (95) Operation not supported
2014-07-13 10:32:55.311330 7f1b6563f780 -1 librbd: error adding image to directory: (95) Operation not supported

When I created another image in an ordinary pool, the action succeeded, so I wonder whether it's my mistake or a problem with the erasure-coded pool. Could you help me solve this puzzling problem? Thank you very much!

My cluster is deployed as follows: one monitor, six OSDs. EC pool: ecpool with 100 pgs, profile: jerasure, k=4, m=2, reed_sol_van.

yours sincerely,
ifstillfly
qixiaof...@chinacloud.com.cn
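For context on why this fails: RBD images need partial object overwrites, which erasure-coded pools do not support in Firefly, hence the EOPNOTSUPP (95). The usual workaround at this point is to put a replicated cache tier in front of the EC pool, so writes land on the replicated tier (a sketch; the cache pool name and PG counts are examples):

```shell
# Create a small replicated pool to act as the cache tier.
ceph osd pool create cachepool 128 128
# Attach it in front of the erasure-coded pool in writeback mode.
ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay ecpool cachepool
# The image create now goes through the replicated cache tier.
rbd create foo --size 1024 --pool ecpool
```

A cache tier used this way normally also wants hit-set and sizing parameters tuned; the above is only the minimum to make the create succeed.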
Re: [ceph-users] Working ISCSI target guide
One other question, if you are going to be using Ceph as a storage system for KVM virtual machines does it even matter if you use iSCSI or not? Meaning that if you are just going to use LVM and have several hypervisors sharing that same VG, then using iSCSI isn't really a requirement unless you are using a hypervisor like ESXi which only works with iSCSI/NFS, correct?

Thanks,
-Drew

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Drew Weaver
Sent: Tuesday, July 15, 2014 9:03 AM
To: 'ceph-users@lists.ceph.com'
Subject: [ceph-users] Working ISCSI target guide

Does anyone have a guide or reproducible method of getting multipath iSCSI working in front of Ceph? Even if it just means having two front-end iSCSI targets each with access to the same underlying Ceph volume? This seems like a super popular topic.
[ceph-users] Working ISCSI target guide
Does anyone have a guide or reproducible method of getting multipath iSCSI working in front of Ceph? Even if it just means having two front-end iSCSI targets each with access to the same underlying Ceph volume? This seems like a super popular topic.

Thanks,
-Drew
Re: [ceph-users] Placing different pools on different OSDs in the same physical servers
Hi,

to avoid confusion I would name the "host" entries in the crush map differently. Make sure these host names can be resolved to the correct boxes though (/etc/hosts on all the nodes). You're also missing a new rule entry (also shown in the link you mentioned). Lastly, and this is *extremely* important: you need to set

[global]
osd crush update on start = false

in your ceph.conf, because there is currently no logic for OSDs to detect their location with different roots present, as "documented" here: http://tracker.ceph.com/issues/6227

If you don't set this, whenever you start an OSD belonging to your SSD root, it will be moved over to the default root.

Side note: this is really unfortunate since with cache pools it is now common to have platters and SSDs on the same physical hosts and also multiple parallel roots.

On 10/07/2014 17:04, Nikola Pajtic wrote:
> Hello to all,
>
> I was wondering is it possible to place different pools on different OSDs, but using only two physical servers?
>
> I was thinking about this: http://tinypic.com/r/30tgt8l/8
>
> I would like to use osd.0 and osd.1 for the Cinder/RBD pool, and osd.2 and osd.3 for Nova instances. I was following the howto from the ceph documentation: http://ceph.com/docs/master/rados/operations/crush-map/#placing-different-pools-on-different-osds , but it assumed that there are 4 physical servers: 2 for the "Platter" pool and 2 for the "SSD" pool.
>
> What I was concerned about is how the CRUSH map should be written and how CRUSH will decide where to send the data, because of the same hostnames in the cinder and nova pools.
> For example, is it possible to do something like this:
>
> # buckets
> host cephosd1 {
>     id -2    # do not change unnecessarily
>     # weight 0.010
>     alg straw
>     hash 0   # rjenkins1
>     item osd.0 weight 0.000
> }
>
> host cephosd1 {
>     id -3    # do not change unnecessarily
>     # weight 0.010
>     alg straw
>     hash 0   # rjenkins1
>     item osd.2 weight 0.010
> }
>
> host cephosd2 {
>     id -4    # do not change unnecessarily
>     # weight 0.010
>     alg straw
>     hash 0   # rjenkins1
>     item osd.1 weight 0.000
> }
>
> host cephosd2 {
>     id -5    # do not change unnecessarily
>     # weight 0.010
>     alg straw
>     hash 0   # rjenkins1
>     item osd.3 weight 0.010
> }
>
> root cinder {
>     id -1    # do not change unnecessarily
>     # weight 0.000
>     alg straw
>     hash 0   # rjenkins1
>     item cephosd1 weight 0.000
>     item cephosd2 weight 0.000
> }
>
> root nova {
>     id -6    # do not change unnecessarily
>     # weight 0.020
>     alg straw
>     hash 0   # rjenkins1
>     item cephosd1 weight 0.010
>     item cephosd2 weight 0.010
> }
>
> If not, could you share an idea how this scenario could be achieved?
>
> Thanks in advance!!
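To make the renaming advice concrete, the nova-side buckets and the missing rule entries might look like this (a sketch only; bucket names, ids, rule numbers and weights are illustrative and must not collide with ids already in the map):

```
# Distinct bucket names per root, all still resolving to the two physical boxes:
host cephosd1-nova {
    id -3
    alg straw
    hash 0  # rjenkins1
    item osd.2 weight 0.010
}

host cephosd2-nova {
    id -5
    alg straw
    hash 0  # rjenkins1
    item osd.3 weight 0.010
}

# One rule per root; pools are pointed at these later via
# "ceph osd pool set <pool> crush_ruleset <n>":
rule cinder {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take cinder
    step chooseleaf firstn 0 type host
    step emit
}

rule nova {
    ruleset 2
    type replicated
    min_size 1
    max_size 10
    step take nova
    step chooseleaf firstn 0 type host
    step emit
}
```

And, as stressed above, "osd crush update on start = false" must go under [global] in ceph.conf before any OSD in the non-default root is restarted.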
Re: [ceph-users] qemu image create failed
Can you connect to your Ceph cluster? You can pass options on the command line like this:

$ qemu-img create -f rbd rbd:instances/vmdisk01:id=leseb:conf=/etc/ceph/ceph-leseb.conf 2G

Cheers.

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance

On 12 Jul 2014, at 03:06, Yonghua Peng wrote:
> Anybody knows this issue? thanks.
>
> Fri, 11 Jul 2014 10:26:47 +0800 from Yonghua Peng:
> Hi,
>
> I tried to create a qemu image, but it failed.
>
> ceph@ceph:~/my-cluster$ qemu-img create -f rbd rbd:rbd/qemu 2G
> Formatting 'rbd:rbd/qemu', fmt=rbd size=2147483648 cluster_size=0
> qemu-img: error connecting
> qemu-img: rbd:rbd/qemu: error while creating rbd: Input/output error
>
> Can you tell what's the problem?
>
> Thanks.
>
> --
> We are hiring cloud Dev/Ops, more details please see: YY Cloud Jobs
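Worth spelling out: "error connecting" from qemu-img means librbd could not reach the cluster at all, not that the image create itself was rejected. Two quick checks before retrying (a sketch; the pool and user names are examples):

```shell
# Can this host reach the monitors with the default conf and keyring?
ceph -s
# Can librbd list the target pool?
rbd ls rbd
# If either fails, fix /etc/ceph/ceph.conf and the keyring first,
# or point qemu-img at them explicitly:
qemu-img create -f rbd rbd:rbd/qemu:id=admin:conf=/etc/ceph/ceph.conf 2G
```

If `ceph -s` works but qemu-img still fails, the qemu binary was likely built without rbd support.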
[ceph-users] Ceph with Multipath ISCSI
Hello guys, I was wondering if there has been any progress on getting multipath iSCSI to play nicely with Ceph? I've followed the howto and created a single-path iSCSI over Ceph RBD with XenServer. However, it would be nice to have built-in failover using iSCSI multipath to another Ceph mon or OSD server.

Cheers
Andrei

--
Andrei Mikhailovsky
Director
Arhont Information Security
Web: http://www.arhont.com http://www.wi-foo.com
Tel: +44 (0)870 4431337
Fax: +44 (0)208 429 3111
PGP: Key ID - 0x2B3438DE
PGP: Server - keyserver.pgp.com
Re: [ceph-users] mon doesn't start
I found the problem. dpkg on storage1 shows ceph is running 0.80.1 while the other 2 are running 0.80.2. The upgrade followed the same method on all three, so I don't understand why. I have tried apt-get update and apt-get upgrade on storage1. How do I bring it to 0.80.2?

On Tue, Jul 15, 2014 at 12:30 AM, Richard Zheng wrote:
> We used to have ubuntu 12.04 with ceph 0.80.1. The upgrade just tries to use ubuntu 14.04. We have three servers which run 1 mon and 10 OSDs each. The other 2 servers are ok.
>
> Before the upgrade we didn't enable upstart and just manually started the mon on all three nodes, e.g. start ceph-mon id=storage. Then we started ceph-mon after touching done and upstart, then ran start ceph-all. We used this way to start the mon on storage2 and storage3 successfully. But when we started the mon on storage1, it failed.
>
> Any suggestion?
>
> On Mon, Jul 14, 2014 at 11:09 PM, Joao Eduardo Luis wrote:
>> (re-cc'ing list without log file)
>>
>> Did you change anything in the cluster aside from ubuntu's version?
>>
>> Were any upgrades performed? If so, from which version to which version and on which monitors?
>>
>> -Joao
>>
>> On 07/15/2014 03:18 AM, Richard Zheng wrote:
>>> Thanks Joao. Not sure if the log contains any potential sensitive information... So I didn't reply to the list.
>>>
>>> On Mon, Jul 14, 2014 at 3:14 PM, Joao Eduardo Luis <joao.l...@inktank.com> wrote:
>>> On 07/14/2014 11:51 PM, Richard Zheng wrote:
>>> Hi,
>>>
>>> We used to have 0.80.1 on Ubuntu 12.04. We recently upgraded to 14.04. However the mon process doesn't start while OSDs are ok. The ceph-mon log shows,
>>>
>>> 2014-07-14 12:04:17.034407 7fa86ddcb700 -1 mon.storage1@0(electing).elector(147) Shutting down because I do not support required monitor features: { compat={},rocompat={},incompat={} }
>>>
>>> Any suggestions?
>>> Set 'debug mon = 10' and 'debug ms = 1' on the monitor, rerun the monitor, have this happen again and drop the log somewhere we can take a look at :)
>>>
>>> -Joao
>>>
>>> --
>>> Joao Eduardo Luis
>>> Software Engineer | http://inktank.com | http://ceph.com
>>
>> --
>> Joao Eduardo Luis
>> Software Engineer | http://inktank.com | http://ceph.com
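On the version skew itself: a plain `apt-get upgrade` can hold ceph back when the configured repo on that one node differs or the package is kept back. A hypothetical sequence for storage1 (assumes the standard ceph Debian packages):

```shell
# Confirm which repo and candidate version apt can actually see on storage1.
apt-get update
apt-cache policy ceph

# If 0.80.2 is listed as a candidate, upgrade just the ceph packages.
apt-get install --only-upgrade ceph ceph-common

# Verify, then restart the mon on this node so it runs the new binary.
ceph --version
```

If `apt-cache policy` does not show 0.80.2 at all, compare /etc/apt/sources.list.d/ between storage1 and the other two nodes.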
Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time
Hi Sage, since this problem is tunables-related, do we need to expect the same behavior or not when we do regular data rebalancing caused by adding new/removing OSDs? I guess not, but would like your confirmation. I'm already on optimal tunables, but I'm afraid to test this by e.g. shutting down 1 OSD.

Thanks,
Andrija

On 14 July 2014 18:18, Sage Weil wrote:
> I've added some additional notes/warnings to the upgrade and release notes:
>
> https://github.com/ceph/ceph/commit/fc597e5e3473d7db6548405ce347ca7732832451
>
> If there is somewhere else where you think a warning flag would be useful, let me know!
>
> Generally speaking, we want to be able to cope with huge data rebalances without interrupting service. It's an ongoing process of improving the recovery vs client prioritization, though, and removing sources of overhead related to rebalancing... and it's clearly not perfect yet. :/
>
> sage
>
> On Sun, 13 Jul 2014, Andrija Panic wrote:
> > Hi,
> > after the ceph upgrade (0.72.2 to 0.80.3) I issued "ceph osd crush tunables optimal" and after only a few minutes I added 2 more OSDs to the CEPH cluster...
> >
> > So these 2 changes were more or less done at the same time - rebalancing because of tunables optimal, and rebalancing because of adding new OSDs...
> >
> > Result - all VMs living on CEPH storage have gone mad, effectively no disk access, blocked so to speak.
> >
> > Since this rebalancing took 5h-6h, I had a bunch of VMs down for that long...
> >
> > Did I do wrong by causing "2 rebalancings" to happen at the same time?
> > Is this behaviour normal, to cause great load on all VMs because they are unable to access CEPH storage effectively?
> >
> > Thanks for any input...
> > --
> > Andrija Panić

--
Andrija Panić
--
http://admintweets.com
--
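For anyone planning a similar tunables change: the client impact can be softened (not eliminated) by throttling recovery before triggering the data movement, and by doing the two changes in separate steps. A sketch, with example values rather than recommendations:

```shell
# Lower recovery/backfill concurrency so client I/O keeps priority.
ceph tell osd.* injectargs '--osd-max-backfills 1'
ceph tell osd.* injectargs '--osd-recovery-max-active 1'
ceph tell osd.* injectargs '--osd-recovery-op-priority 1'

# Only then switch the tunables...
ceph osd crush tunables optimal

# ...and add the new OSDs in a separate step, once the cluster
# has returned to HEALTH_OK.
```

injectargs changes are runtime-only; the same options would go into ceph.conf to survive OSD restarts.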
Re: [ceph-users] mon doesn't start
We used to have ubuntu 12.04 with ceph 0.80.1. The upgrade just tries to use ubuntu 14.04. We have three servers which run 1 mon and 10 OSDs each. The other 2 servers are ok.

Before the upgrade we didn't enable upstart and just manually started the mon on all three nodes, e.g. start ceph-mon id=storage. Then we started ceph-mon after touching done and upstart, then ran start ceph-all. We used this way to start the mon on storage2 and storage3 successfully. But when we started the mon on storage1, it failed.

Any suggestion?

On Mon, Jul 14, 2014 at 11:09 PM, Joao Eduardo Luis wrote:
> (re-cc'ing list without log file)
>
> Did you change anything in the cluster aside from ubuntu's version?
>
> Were any upgrades performed? If so, from which version to which version and on which monitors?
>
> -Joao
>
> On 07/15/2014 03:18 AM, Richard Zheng wrote:
>> Thanks Joao. Not sure if the log contains any potential sensitive information... So I didn't reply to the list.
>>
>> On Mon, Jul 14, 2014 at 3:14 PM, Joao Eduardo Luis <joao.l...@inktank.com> wrote:
>> On 07/14/2014 11:51 PM, Richard Zheng wrote:
>> Hi,
>>
>> We used to have 0.80.1 on Ubuntu 12.04. We recently upgraded to 14.04. However the mon process doesn't start while OSDs are ok. The ceph-mon log shows,
>>
>> 2014-07-14 12:04:17.034407 7fa86ddcb700 -1 mon.storage1@0(electing).elector(147) Shutting down because I do not support required monitor features: { compat={},rocompat={},incompat={} }
>>
>> Any suggestions?
>>
>> Set 'debug mon = 10' and 'debug ms = 1' on the monitor, rerun the monitor, have this happen again and drop the log somewhere we can take a look at :)
>>
>> -Joao
>>
>> --
>> Joao Eduardo Luis
>> Software Engineer | http://inktank.com | http://ceph.com
>
> --
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
Re: [ceph-users] mon doesn't start
(re-cc'ing list without log file)

Did you change anything in the cluster aside from ubuntu's version?

Were any upgrades performed? If so, from which version to which version and on which monitors?

-Joao

On 07/15/2014 03:18 AM, Richard Zheng wrote:
> Thanks Joao. Not sure if the log contains any potential sensitive information... So I didn't reply to the list.
>
> On Mon, Jul 14, 2014 at 3:14 PM, Joao Eduardo Luis <joao.l...@inktank.com> wrote:
> On 07/14/2014 11:51 PM, Richard Zheng wrote:
> Hi,
>
> We used to have 0.80.1 on Ubuntu 12.04. We recently upgraded to 14.04. However the mon process doesn't start while OSDs are ok. The ceph-mon log shows,
>
> 2014-07-14 12:04:17.034407 7fa86ddcb700 -1 mon.storage1@0(electing).elector(147) Shutting down because I do not support required monitor features: { compat={},rocompat={},incompat={} }
>
> Any suggestions?
>
> Set 'debug mon = 10' and 'debug ms = 1' on the monitor, rerun the monitor, have this happen again and drop the log somewhere we can take a look at :)
>
> -Joao

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
Re: [ceph-users] basic questions about pool
Hi Pragya,

Let me try to answer these.

#1: The decision is based on your use case (performance, reliability). If you need high performance out of your cluster, the deployer will create a pool on SSDs and assign this pool to applications which require higher I/O. For example, if you integrate OpenStack with Ceph, you can instruct the OpenStack configuration files to write data to a specific Ceph pool (http://ceph.com/docs/master/rbd/rbd-openstack/#configuring-glance); similarly, you can tell CephFS and RadosGW which pool to use for data storage.

#2: Usually the end user (the client of the Ceph cluster) does not care where the data is stored, which pool it is using, or what the real physical location of the data is. The end user will demand specific performance, reliability and availability, and it is the job of the Ceph admin to fulfil those storage requirements using Ceph's features: SSDs, erasure coding, replication levels, etc.

Block device: the end user tells the application (QEMU/KVM, OpenStack, etc.) which pool it should use for data storage; rbd is the default pool for block devices.
CephFS: the end user mounts this pool as a filesystem; the default pools are data and metadata.
RadosGW: the end user stores objects using the S3 or Swift API.

- Karan Singh -

On 15 Jul 2014, at 07:42, pragya jain wrote:
> thank you very much, Craig, for your clear explanation against my questions.
>
> Now I am very clear about the concept of pools in ceph.
>
> But I have two small questions:
> 1. How does the deployer decide that a particular type of information will be stored in a particular pool? Are there any settings at the time of creation of a pool that a deployer should make to ensure which type of data will be stored in which pool?
>
> 2. How does an end-user specify that his/her data will be stored in which pool? How can an end-user come to know which pools are stored on SSDs or on HDDs, and what are the properties of a particular pool?
> > Thanks again, Please help to clear these confusions also. > > Regards > Pragya Jain > > > On Sunday, 13 July 2014 5:04 AM, Craig Lewis > wrote: > > > I'll answer out of order. > > #2: rdb is used for RDB images. data and metadata are used by CephFS. > RadosGW's default pools will be created the first time radosgw starts up. If > you aren't using RDB or CephFS, you can ignore those pools. > > #1: RadosGW will use several pools to segregate it's data. There are a > couple pools for store user/subuser information, as well as pools for storing > the actual data. I'm using federation, and I have a total of 18 pools that > RadosGW is using in some form. Pools are a way to logically separate your > data, and pools can also have different replication/storage settings. For > example, I could say that the .rgw.buckets.index pool needs 4x replication > and is only stored on SSDs, while .rgw.bucket is 3x replication on HDDs. > > #3: In addition to #1, you can setup different pools to actually store user > data in RadosGW. For example, an end user may have some very important data > that you want replicated 4 times, and some other data that needs to be stored > on SSDs for low latency. Using CRUSH, you would create the some rados pools > with those specs. Then you'd setup some placement targets in RadosGW that > use those pools. A user that cares will specify a placement target when they > create a bucket. That way they can decide what the storage requirements are. > If they don't care, then they can just use the default. > > Does that help? > > > > On Thu, Jul 10, 2014 at 11:34 PM, pragya jain wrote: > hi all, > > I have some very basic questions about pools in ceph. > > According to ceph documentation, as we deploy a ceph cluster with radosgw > instance over it, ceph creates pool by default to store the data or the > deployer can also create pools according to the requirement. > > Now, my question is: > 1. what is the relevance of multiple pools in a cluster? > i.e. 
why should a deployer create multiple pools in a cluster? what should be the benefits of creating multiple pools?
>
> 2. according to the docs, the default pools are data, metadata, and rbd. what is the difference among these three types of pools?
>
> 3. when a system deployer has deployed a ceph cluster with a radosgw interface and starts providing services to the end-user, such that end-users can create their accounts on the ceph cluster and can store/retrieve their data to/from the cluster, then does the end user have any concern about the pools created in the cluster?
>
> Please somebody help me to clear these confusions.
>
> regards
> Pragya Jain
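To make the deployer side of this concrete: a pool is tied to a storage tier through a CRUSH rule plus per-pool settings (a sketch; the pool name and rule number are examples, using Firefly-era syntax):

```shell
# Create a pool and point it at a CRUSH rule that selects, say, the SSD root.
ceph osd pool create fast-pool 128 128
ceph osd pool set fast-pool crush_ruleset 1   # rule 1 targets the SSD hosts
ceph osd pool set fast-pool size 4            # per-pool replication level

# Then the client application is told to use it, e.g. for Glance images
# in glance-api.conf:
#   rbd_store_pool = fast-pool
```

Nothing marks a pool as "for images" or "for volumes" by itself; the assignment happens entirely in the application configuration, which is why the end user normally never sees the pool layout.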