Re: [ceph-users] ceph cluster inconsistency?
Hi Sage,

I'm thinking about how to encode/decode the object name into a DB key without changing the sort order. What do you think of this idea: to avoid the separator character disturbing the sort order, we can append the length of each field. For example:

[hash]$[name]$[key]$[nspace]$[snap_id]$[pool]$[generate_id]$[gen_t]$[hash_len]$[name_len]..$[gen_t_len]$version

('$' is the separator character.) The separator then never needs to be escaped, because we can read each field's length from the end of the key. When reading a key, we parse from the end toward the beginning and quickly extract each field. Another advantage is better performance for long keys, which no longer need unescaping. Does anyone have other concerns?

On Sat, Aug 16, 2014 at 10:50 PM, Haomai Wang haomaiw...@gmail.com wrote:
On Fri, Aug 15, 2014 at 9:10 PM, Sage Weil sw...@redhat.com wrote:
On Fri, 15 Aug 2014, Haomai Wang wrote:

Hi Kenneth,

I don't find valuable info in your logs; they lack the necessary debug output from the crashing code path. But I scanned the encode/decode implementation in GenericObjectMap and found something bad. For example, suppose two oids have the same hash and their names are:

A: rb.data.123
B: rb-123

At the ghobject_t comparison level, A > B. But GenericObjectMap encodes '.' as '%e', so the keys in the DB are:

A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head
B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head

and in that encoding A < B. It also seems that the escape function is useless and should be disabled.

I'm not sure whether Kenneth's problem is hitting this bug, because this scenario only occurs when the object set is large enough that two objects share the same hash value.

Kenneth, could you find time to run:

ceph-kvstore-tool [path-to-osd] list _GHOBJTOSEQ_ | grep 6adb1100 -A 100

ceph-kvstore-tool is a debug tool which can be compiled from source: clone the ceph repo and run ./autogen.sh; ./configure; cd src; make ceph-kvstore-tool. [path-to-osd] should be /var/lib/ceph/osd-[id]/current/.
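A minimal Python sketch of both points, under the assumption that the DB compares keys byte-wise (the helper names here are invented for illustration, not Ceph code): escaping '.' to '%e' inverts the relative order of the two example names above, while the proposed length-suffixed layout keeps the separator literal and still decodes unambiguously from the end of the key.

```python
# Illustrative only -- not the Ceph implementation. escape_dot mimics the
# described GenericObjectMap behaviour; encode_fields/decode_fields sketch
# the proposed length-suffixed scheme.

def escape_dot(name):
    return name.replace('.', '%e')

a, b = "rb.data.123", "rb-123"
assert a > b                            # plain byte order: A sorts after B
assert escape_dot(a) < escape_dot(b)    # escaped keys: order is inverted

def encode_fields(fields, sep='$'):
    # Join fields with a literal separator, then append each field's
    # length; no escaping needed because the decoder works from the end.
    body = sep.join(fields)
    lengths = sep.join(str(len(f)) for f in fields)
    return body + sep + lengths

def decode_fields(key, nfields, sep='$'):
    # The last nfields separator-delimited tokens are the lengths;
    # slice the fields back out of the body using them.
    parts = key.rsplit(sep, nfields)
    body, lengths = parts[0], [int(x) for x in parts[1:]]
    fields, pos = [], 0
    for n in lengths:
        fields.append(body[pos:pos + n])
        pos += n + 1  # skip the separator after this field
    return fields

fields = ["51615000", "rb.data.123", "head"]
assert decode_fields(encode_fields(fields), len(fields)) == fields
```

Note that the round trip works even when a field itself contains '$', since the decoder only ever splits off the trailing length tokens.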
6adb1100 is from your verbose log, and the next 100 rows should contain the necessary info. You can also get ceph-kvstore-tool from the 'ceph-tests' package.

Hi Sage, do you think we need to provide an upgrade function to fix it?

Hmm, we might. This only affects the key/value encoding, right? The FileStore is using its own function to map these to file names? Can you open a ticket in the tracker for this?

I quickly scanned the code in DBObjectMap. It seems to have the same problem, but because FileStore doesn't use it for scanning and sorting, it appears to rarely hit this bug (I haven't thought it through carefully). An issue is open: http://tracker.ceph.com/issues/9143

Thanks!
sage

On Thu, Aug 14, 2014 at 7:36 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote:
- Message from Haomai Wang haomaiw...@gmail.com -
Date: Thu, 14 Aug 2014 19:11:55 +0800
From: Haomai Wang haomaiw...@gmail.com
Subject: Re: [ceph-users] ceph cluster inconsistency?
To: Kenneth Waegeman kenneth.waege...@ugent.be

Could you add the config "debug_keyvaluestore = 20/20" to the crashed osd and replay the command causing the crash? I would like to get more debug info. Thanks.

I included the log in attachment! Thanks!
On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote:

I have osd_objectstore = keyvaluestore-dev in the global section of my ceph.conf.

[root@ceph002 ~]# ceph osd erasure-code-profile get profile11
directory=/usr/lib64/ceph/erasure-code
k=8
m=3
plugin=jerasure
ruleset-failure-domain=osd
technique=reed_sol_van

The ecdata pool has this as profile:

pool 3 'ecdata' erasure size 11 min_size 8 crush_ruleset 2 object_hash rjenkins pg_num 128 pgp_num 128 last_change 161 flags hashpspool stripe_width 4096

EC rule in the crushmap:

rule ecdata {
        ruleset 2
        type erasure
        min_size 3
        max_size 20
        step set_chooseleaf_tries 5
        step take default-ec
        step choose indep 0 type osd
        step emit
}
root default-ec {
        id -8           # do not change unnecessarily
        # weight 140.616
        alg straw
        hash 0  # rjenkins1
        item ceph001-ec weight 46.872
        item ceph002-ec weight 46.872
        item ceph003-ec weight 46.872
        ...

Cheers!
Kenneth

- Message from Haomai Wang haomaiw...@gmail.com -
Date: Thu, 14 Aug 2014 10:07:50 +0800
From: Haomai Wang haomaiw...@gmail.com
Subject: Re: [ceph-users] ceph cluster inconsistency?
To: Kenneth Waegeman kenneth.waege...@ugent.be
Cc: ceph-users ceph-users@lists.ceph.com

Hi Kenneth,

Could you give your configuration related to EC and KeyValueStore? Not sure whether it's a bug in KeyValueStore.

On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote:

Hi,

I was doing some tests with rados bench on an erasure-coded pool (using keyvaluestore-dev
Re: [ceph-users] RadosGW problems
Hi Marco –

In CentOS 6, you also had to edit /etc/httpd/conf.d/fastcgi.conf to turn OFF the fastcgi wrapper. I haven't tested in v7 yet, but I'd guess it's required there too:

# wrap all fastcgi script calls in suexec
FastCgiWrapper Off

Give that a try if you haven't already – restart httpd and ceph-radosgw afterward.

Kurt

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Marco Garcês
Sent: Friday, August 15, 2014 12:46 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] RadosGW problems

Hi there,

I am using CentOS 7 with Ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6), 3 OSDs, 3 MONs, 1 RadosGW (which also serves as the ceph-deploy node). I followed all the instructions in the docs for setting up a basic Ceph cluster, and then the ones for setting up RadosGW. I can't seem to use the Swift interface at all, and the S3 interface times out after 30 seconds:

[Fri Aug 15 18:25:33.290877 2014] [:error] [pid 6197] [client 10.5.5.222:58051] FastCGI: comm with server "/var/www/cgi-bin/s3gw.fcgi" aborted: idle timeout (30 sec)
[Fri Aug 15 18:25:33.291781 2014] [:error] [pid 6197] [client 10.5.5.222:58051] FastCGI: incomplete headers (0 bytes) received from server "/var/www/cgi-bin/s3gw.fcgi"

My ceph.conf:

[global]
fsid = 581bcd61-8760-4756-a7c8-e8275c0957ad
mon_initial_members = CEPH01, CEPH02, CEPH03
mon_host = 10.2.27.81,10.2.27.82,10.2.27.83
public network = 10.2.27.0/25
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd pool default size = 2
osd pool default pg num = 333
osd pool default pgp num = 333
osd journal size = 1024

[client.radosgw.gateway]
host = GATEWAY
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
log file = /var/log/ceph/client.radosgw.gateway.log
rgw print continue = false
rgw enable ops log = true

My apache rgw.conf:
FastCgiExternalServer /var/www/cgi-bin/s3gw.fcgi -socket /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock

<VirtualHost *:443>
    SSLEngine on
    SSLCertificateFile /etc/pki/tls/certs/ca_rgw.crt
    SSLCertificateKeyFile /etc/pki/tls/private/ca_rgw.key
    SetEnv SERVER_PORT_SECURE 443

    ServerName gateway.testes.local
    ServerAlias *.gateway.testes.local
    ServerAdmin marco.gar...@testes.co.mz
    DocumentRoot /var/www/cgi-bin

    RewriteEngine On
    #RewriteRule ^/(.*) /s3gw.fcgi?%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
    RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1&params=$2&%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

    <IfModule mod_fastcgi.c>
        <Directory /var/www>
            Options +ExecCGI
            AllowOverride All
            SetHandler fastcgi-script
            Order allow,deny
            Allow from all
            AuthBasicAuthoritative Off
        </Directory>
    </IfModule>

    AllowEncodedSlashes On
    ErrorLog /var/log/httpd/error_rgw_ssl.log
    CustomLog /var/log/httpd/access_rgw_ssl.log combined
    ServerSignature Off
</VirtualHost>

My /var/www/cgi-bin/s3gw.fcgi:

#!/bin/sh
exec /usr/bin/radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gateway

My Rados user:

radosgw-admin user info --uid=johndoe
{ "user_id": "johndoe",
  "display_name": "John Doe",
  "email": "j...@example.com",
  "suspended": 0,
  "max_buckets": 1000,
  "auid": 0,
  "subusers": [
        { "id": "johndoe:swift",
          "permissions": "full-control"}],
  "keys": [
        { "user": "johndoe:swift",
          "access_key": "265DJESOJGSK953EE4LE",
          "secret_key": ""},
        { "user": "johndoe",
          "access_key": "U4AR5757MCON3AZYAB97",
          "secret_key": "05rg47Oa+njo8uxTeX+urBPF0ZRPWvVq8nfrC5cN"}],
  "swift_keys": [
        { "user": "johndoe:swift",
          "secret_key": "Lags5xwX5aiPgkG\/QqA8HygKs6AQYO46dBXS0ZGS"}],
  "caps": [],
  "op_mask": "read, write, delete",
  "default_placement": "",
  "placement_tags": [],
  "bucket_quota": { "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1},
  "user_quota": { "enabled": false,
      "max_size_kb": -1,
      "max_objects": -1},
  "temp_url_keys": []}

I can reach https://gateway.testes.local, and I can log in with S3, but can't log in with Swift (using Cyberduck).
Also, I can create buckets with S3, but if I upload a file, it times out with the error above. I need to use both the S3 and the Swift API. Can you help me?

Thank you in advance,
regards,
Marco Garcês

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
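Not a diagnosis of the timeout, but for anyone who wants to test the S3 side of the gateway without Cyberduck: a minimal sketch of how an AWS-signature-v2 Authorization header is built. The access/secret keys are just the example user's values from the radosgw-admin output above; everything else is plain stdlib.

```python
import base64
import hashlib
import hmac

# Keys copied from the radosgw-admin output above -- purely illustrative.
access_key = "U4AR5757MCON3AZYAB97"
secret_key = "05rg47Oa+njo8uxTeX+urBPF0ZRPWvVq8nfrC5cN"

def s3_auth_header(verb, resource, date, content_md5="", content_type=""):
    # AWS signature v2: base64(HMAC-SHA1(secret, string-to-sign)),
    # sent as "AWS <access_key>:<signature>".
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    return "AWS %s:%s" % (access_key, base64.b64encode(digest).decode())

# e.g. for a GET on the root resource (service/bucket listing):
print(s3_auth_header("GET", "/", "Fri, 15 Aug 2014 18:25:33 GMT"))
```

If a hand-signed request built this way succeeds against the gateway while Cyberduck's does not, the problem is more likely on the client or DNS side than in the FastCGI chain.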
Re: [ceph-users] RadosGW problems
On Mon 18 Aug 2014 12:45:33 AM AST, Bachelder, Kurt wrote:
> Hi Marco – In CentOS 6, you also had to edit /etc/httpd/conf.d/fastcgi.conf to turn OFF the fastcgi wrapper. I haven't tested in v7 yet, but I'd guess it's required there too. [snip]

___
ceph-users mailing list
ceph-users@lists.ceph.com
Re: [ceph-users] cache pools on hypervisor servers
Thanks a lot for your input. I will proceed with putting the cache pool on the storage layer instead.

Andrei

- Original Message -
From: Sage Weil sw...@redhat.com
To: Andrei Mikhailovsky and...@arhont.com
Cc: Robert van Leeuwen robert.vanleeu...@spilgames.com, ceph-users@lists.ceph.com
Sent: Thursday, 14 August, 2014 6:33:25 PM
Subject: Re: [ceph-users] cache pools on hypervisor servers

On Thu, 14 Aug 2014, Andrei Mikhailovsky wrote:

Hi guys,

Could someone from the ceph team please comment on running an osd cache pool on the hypervisors? Is this a good idea, or will it create a lot of performance issues?

It doesn't sound like an especially good idea. In general you want the cache pool to be significantly faster than the base pool (think PCI-attached flash), and there won't be any particular affinity to the host where the VM consuming the storage happens to be, so I don't think there is a reason to put the flash in the hypervisor nodes unless there simply isn't anywhere else to put it. Probably what you're after is a client-side write-through cache? There is some ongoing work to build this into qemu and possibly librbd, but nothing is ready yet that I know of.

sage

Anyone in the ceph community that has done this? Any results to share?

Many thanks
Andrei

From: Robert van Leeuwen robert.vanleeu...@spilgames.com
To: Andrei Mikhailovsky and...@arhont.com
Cc: ceph-users@lists.ceph.com
Sent: Thursday, 14 August, 2014 9:31:24 AM
Subject: RE: cache pools on hypervisor servers

Personally I am not worried too much about the hypervisor-to-hypervisor traffic, as I am using a dedicated infiniband network for storage; it is not used for guest-to-guest traffic, internet traffic, or anything else. I would like to decrease, or at least smooth out, the traffic peaks between the hypervisors and the SAS/SATA osd storage servers. I guess the ssd cache pool would enable me to do that, as the eviction rate should be more structured than the random io writes that guest vms generate.
Sounds reasonable. I'm very interested in the effect of cache pools in combination with running VMs on them, so I'd be happy to hear what you find ;)

I will give it a try and share back the results when we get the ssd kit.

Excellent, looking forward to it.

As a side note: running OSDs on hypervisors would not be my preferred choice, since hypervisor load might impact Ceph performance.

Do you think it is not a good idea even if you have a lot of cores on the hypervisors? Like 24 or 32 per host server? According to my monitoring, our osd servers are not that stressed and generally have over 50% of their cpu power free.

The number of cores does not really matter if they are all busy ;) I honestly do not know how Ceph behaves when it is CPU-starved, but I guess it might not be pretty. Since your whole environment will come crumbling down if your storage becomes unavailable, it is not a risk I would take lightly.

Cheers,
Robert van Leeuwen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
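For reference, attaching an SSD cache pool in front of a slower base pool in the firefly era looks roughly like the commands below. The pool names are placeholders, and this is only a sketch of the basic writeback setup; check `ceph osd tier --help` and the cache-tiering docs for your exact version before running anything.

```shell
# Attach 'ssd-cache' as a writeback cache tier in front of 'sas-data'
# (placeholder pool names -- adjust to your cluster).
ceph osd tier add sas-data ssd-cache
ceph osd tier cache-mode ssd-cache writeback
ceph osd tier set-overlay sas-data ssd-cache

# A hit set is required so the cache agent can track object usage
# and decide what to flush/evict.
ceph osd pool set ssd-cache hit_set_type bloom
ceph osd pool set ssd-cache hit_set_count 1
ceph osd pool set ssd-cache hit_set_period 3600
```

With the overlay set, clients transparently talk to the cache pool, which is why (as Sage notes above) the cache tier needs to be uniformly fast rather than co-located with any particular hypervisor.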