Re: [ceph-users] Help with the Hammer to Jewel upgrade procedure without losing write access to the buckets
Hi Mohammed,

Thanks for the hint. I think I remember seeing this when Jewel came out, but I assumed it had to be a mistake, or a mere recommendation rather than a mandatory requirement, because I have always upgraded the OSDs last. Today I upgraded the OSD nodes in the test environment to Jewel and regained write access to the buckets.

In production we have multiple RGW nodes behind load balancers, so we can upgrade them one at a time. If we have to upgrade all the OSD nodes first (which takes much longer, considering they are many more) while the old Hammer RGW cannot talk to a Jewel cluster, then one cannot perform a live upgrade of Ceph, which I think breaks the promise of a large, distributed, always-on storage system...

Now I'll have to test what happens with the Cinder volumes attached to a Hammer cluster that's being upgraded to Jewel, and whether upgrading the Ceph packages on the compute nodes to Jewel will require a restart of the VMs or a reboot of the servers.
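One thing worth checking on the compute nodes (a rough sketch, assuming QEMU/KVM guests; process names and paths may differ on your systems) is which librbd build each running VM still has mapped, since upgrading the package alone does not affect already-running processes:

    # List the librbd library mapped by each running QEMU process.
    # A "(deleted)" suffix means the process still uses the pre-upgrade
    # library and will only pick up the Jewel librbd after a stop/start
    # or a live migration to an upgraded host.
    for pid in $(pgrep -f qemu); do
        echo "== pid $pid =="
        grep librbd /proc/$pid/maps | awk '{print $6}' | sort -u
    done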
Thank you again for your help,
George

> On Jan 25, 2017, at 19:10, Mohammed Naser wrote:
>
> George,
>
> I believe the supported upgrade model is monitors, OSDs, metadata
> servers and finally object gateways.
>
> I would suggest trying the supported path; if you're still having
> issues *with* the correct upgrade sequence, I would look further
> into it.
>
> Thanks
> Mohammed
>
>> On Jan 25, 2017, at 6:24 PM, George Mihaiescu wrote:
>> [original message trimmed; quoted in full below]
Re: [ceph-users] Help with the Hammer to Jewel upgrade procedure without losing write access to the buckets
George,

I believe the supported upgrade model is monitors, OSDs, metadata servers and finally object gateways.

I would suggest trying the supported path; if you're still having issues *with* the correct upgrade sequence, I would look further into it.
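Roughly, that sequence looks like the following (a sketch only; the package and service commands assume a Debian/Ubuntu system with systemd, so adjust for your distro and init system, and note that Jewel daemons run as the "ceph" user, which may require a chown of the data directories before restarting):

    ceph osd set noout                  # avoid rebalancing while daemons restart

    # 1. Monitor nodes, one at a time, waiting for quorum to re-form:
    apt-get update && apt-get install ceph
    systemctl restart ceph-mon@$(hostname -s)

    # 2. OSD nodes, one at a time, waiting for HEALTH_OK in between
    #    (<id> is a placeholder for each OSD on the node):
    apt-get update && apt-get install ceph
    systemctl restart ceph-osd@<id>

    ceph osd unset noout

    # 3. Metadata servers, then finally the object gateways:
    systemctl restart ceph-radosgw@rgw.$(hostname -s)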
".rgw.buckets.extra", > "index_type": 0 > } > } > ], > "metadata_heap": ".rgw.meta", > "realm_id": "" > } > > root@ceph-mon1:~# rados df > pool name KB objects clones degraded > unfound rdrd KB wrwr KB > .log 0 12700 > 04140241275414020 > .rgw 4 1400 > 0 147 117 35 14 > .rgw.buckets 11635400 > 04 4969 3811637 > .rgw.buckets.index0 5600 > 0 1871 1815 1190 > .rgw.control 0800 > 00000 > .rgw.gc
[ceph-users] Help with the Hammer to Jewel upgrade procedure without losing write access to the buckets
Hi,

I need your help with upgrading our cluster from Hammer (last version) to Jewel 10.2.5 without losing write access to Radosgw.

We have a fairly large cluster (4.3 PB raw) mostly used to store large S3 objects, and we currently have more than 500 TB of data in the ".rgw.buckets" pool, so I'm very cautious about upgrading it to Jewel. The plan is to upgrade Ceph-mon and Radosgw to 10.2.5, while keeping the OSD nodes on Hammer, then slowly update them as well.

I am currently testing the upgrade procedure in a lab environment, but once I update ceph-mon and radosgw to Jewel, I cannot upload files into new or existing buckets anymore, though I can still create new buckets (a minimal reproduction is sketched at the end of this message).

I read [1], [2], [3] and [4] and even ran the script in [4], as can be seen below, but still cannot upload new objects.

I was hoping that if I waited long enough to update from Hammer to Jewel, most of the big issues would be solved by point releases, but it seems that I'm doing something wrong, probably because of a lack of up-to-date documentation.

After the update to Jewel, this is how things look in my test environment:

root@ceph-mon1:~# radosgw zonegroup get

root@ceph-mon1:~# radosgw-admin period get
period init failed: (2) No such file or directory
2017-01-25 10:13:06.941018 7f98f0d13900  0 RGWPeriod::init failed to init realm id : (2) No such file or directory

root@ceph-mon1:~# radosgw-admin zonegroup get
failed to init zonegroup: (2) No such file or directory

root@ceph-mon1:~# ceph --version
ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)

root@ceph-mon1:~# radosgw-admin realm list
{
    "default_info": "",
    "realms": []
}

root@ceph-mon1:~# radosgw-admin period list
{
    "periods": []
}

root@ceph-mon1:~# radosgw-admin period get
period init failed: (2) No such file or directory
2017-01-25 12:26:07.217986 7f97ca82e900  0 RGWPeriod::init failed to init realm id : (2) No such file or directory

root@ceph-mon1:~# radosgw-admin zonegroup get --rgw-zonegroup=default
{
    "id": "default",
    "name": "default",
    "api_name": "",
    "is_master": "true",
    "endpoints": [],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "default",
    "zones": [
        {
            "id": "default",
            "name": "default",
            "endpoints": [],
            "log_meta": "false",
            "log_data": "false",
            "bucket_index_max_shards": 0,
            "read_only": "false"
        }
    ],
    "placement_targets": [
        {
            "name": "default-placement",
            "tags": []
        }
    ],
    "default_placement": "default-placement",
    "realm_id": ""
}

root@ceph-mon1:~# radosgw-admin zone get --zone-id=default
{
    "id": "default",
    "name": "default",
    "domain_root": ".rgw",
    "control_pool": ".rgw.control",
    "gc_pool": ".rgw.gc",
    "log_pool": ".log",
    "intent_log_pool": ".intent-log",
    "usage_log_pool": ".usage",
    "user_keys_pool": ".users",
    "user_email_pool": ".users.email",
    "user_swift_pool": ".users.swift",
    "user_uid_pool": ".users.uid",
    "system_key": {
        "access_key": "",
        "secret_key": ""
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": ".rgw.buckets.index",
                "data_pool": ".rgw.buckets",
                "data_extra_pool": ".rgw.buckets.extra",
                "index_type": 0
            }
        }
    ],
    "metadata_heap": ".rgw.meta",
    "realm_id": ""
}

root@ceph-mon1:~# rados df
pool name               KB  objects  clones  degraded  unfound     rd  rd KB     wr  wr KB
.log                     0      127       0         0        0  41402  41275  41402      0
.rgw                     4       14       0         0        0    147    117     35     14
.rgw.buckets         11635        4       0         0        0      4   4969     38  11637
.rgw.buckets.index       0       56       0         0        0   1871   1815    119      0
.rgw.control             0        8       0         0        0      0      0      0      0
.rgw.gc                  0       32       0         0        0   5214   5182   3519      0
.rgw.meta                2        8       0         0        0      0      0     20      8
.rgw.root                2        4       0         0        0     72     48     12      8
.usage                   0        2       0         0        0     87     87    174      0
.users.uid               1        4       0         0        0    104     96     44      2
rbd                      0        0       0         0        0      0      0      0      0
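Reproducing the failure is just a bucket create followed by an object upload; a minimal sketch, assuming an s3cmd profile configured for a test user (the config path, test file, and bucket name are placeholders):

    # Bucket creation still succeeds after the mon/rgw upgrade,
    # but the object upload is what fails:
    s3cmd --config=/root/.s3cfg-testuser mb s3://upgrade-test
    dd if=/dev/urandom of=/tmp/probe.bin bs=1M count=4
    s3cmd --config=/root/.s3cfg-testuser put /tmp/probe.bin s3://upgrade-test/probe.bin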