Re: [ceph-users] Help with the Hammer to Jewel upgrade procedure without losing write access to the buckets

2017-01-26 Thread George Mihaiescu
Hi Mohammed,

Thanks for the hint. I think I remember seeing this when Jewel came out, but I assumed it was a mistake, or merely a recommendation rather than a mandatory requirement, because I have always upgraded the OSDs last.

Today I upgraded my OSD nodes in the test environment to Jewel and regained 
write access to the buckets.
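
For anyone hitting the same thing, a minimal sketch of the per-node sequence used for the OSD upgrade (assuming Ubuntu packages and the systemd units shipped with Jewel; the ownership change is the one called out in the Jewel release notes, since the daemons now run as the "ceph" user instead of root):

# once, before starting, to avoid rebalancing while nodes restart
ceph osd set noout

# then, on each OSD node in turn
apt-get update && apt-get install ceph    # upgrade the packages to 10.2.x
systemctl stop ceph-osd.target            # stop the OSDs on this node
chown -R ceph:ceph /var/lib/ceph          # can take a while on large OSDs
systemctl start ceph-osd.target
ceph -s                                   # wait for all PGs active+clean before the next node

# once every OSD in the cluster runs Jewel
ceph osd unset noout
ceph osd set sortbitwise                  # per the Jewel release notes, only after all OSDs are upgraded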

In production we have multiple RGW nodes behind load balancers, so we can 
upgrade them one at a time.

If we have to upgrade all of the OSD nodes first (which takes much longer, since there are many more of them) while the old Hammer RGW cannot talk to a Jewel cluster, then it means one cannot perform a live upgrade of Ceph, which I think breaks the promise of a large, distributed, always-on storage system...

Now I'll have to test what happens to the Cinder volumes attached to a Hammer 
cluster that is being upgraded to Jewel, and whether upgrading the Ceph packages on 
the compute nodes to Jewel will require restarting the VMs or rebooting the 
servers.
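
A quick, generic way to check whether a given guest is still running against the old client libraries after the packages are upgraded (nothing Ceph-specific here; the qemu process name is an assumption and may differ between distributions) is to look at what its qemu process has mapped, since a running process keeps the old library until it is restarted or live-migrated:

# show the librbd/librados objects mapped by each running qemu process
for pid in $(pgrep -f qemu); do
    echo "== pid $pid =="
    grep -E 'librbd|librados' /proc/$pid/maps | awk '{print $6, $7}' | sort -u
done

If the upgrade replaced the library on disk, the mapped copy usually shows up marked as deleted, meaning that VM is still running the Hammer client code.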

Thank you again for your help,
George


> On Jan 25, 2017, at 19:10, Mohammed Naser  wrote:
> 
> George,
> 
> I believe the supported upgrade order is monitors, then OSDs, then metadata servers, 
> and finally object gateways.
> 
> I would suggest trying the supported path; if you're still having issues *with* 
> the correct upgrade sequence, I would look further into it.
> 
> Thanks
> Mohammed
> 

Re: [ceph-users] Help with the Hammer to Jewel upgrade procedure without losing write access to the buckets

2017-01-25 Thread Mohammed Naser
George,

I believe the supported upgrade order is monitors, then OSDs, then metadata servers, 
and finally object gateways.

I would suggest trying the supported path; if you're still having issues *with* 
the correct upgrade sequence, I would look further into it.

Thanks
Mohammed
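
For what it's worth, a small sketch of how to confirm which version each daemon class is actually running during a staged upgrade (mon.ceph-mon1 is only an example name taken from this thread; substitute your own daemon IDs):

ceph tell osd.* version              # reports the running version of every OSD
ceph daemon mon.ceph-mon1 version    # run on the monitor host, via its admin socket
radosgw --version                    # run on each gateway host to check the installed RGW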


[ceph-users] Help with the Hammer to Jewel upgrade procedure without losing write access to the buckets

2017-01-25 Thread George Mihaiescu
Hi,

I need your help with upgrading our cluster from Hammer (last version) to
Jewel 10.2.5 without losing write access to Radosgw.

We have a fairly large cluster (4.3 PB raw) mostly used to store large S3
objects, and we currently have more than 500 TB of data in the
".rgw.buckets" pool, so I'm very cautious about upgrading it to Jewel.
The plan is to upgrade Ceph-mon and Radosgw to 10.2.5, while keeping the
OSD nodes on Hammer, then slowly update them as well.


I am currently testing the upgrade procedure in a lab environment, but once
I update ceph-mon and radosgw to Jewel, I can no longer upload files into new or
existing buckets, although I can still create new buckets.
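
The symptom can be reproduced with any S3 client; a minimal sketch with s3cmd (the bucket name and test file are placeholders, and s3cmd is assumed to already be configured against the gateway):

s3cmd mb s3://upgrade-test                              # creating a bucket still works
echo hello > /tmp/probe.txt
s3cmd put /tmp/probe.txt s3://upgrade-test/probe.txt    # this upload is what fails after the upgrade
s3cmd ls s3://upgrade-test                              # list the bucket to confirm nothing was stored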


I read [1], [2], [3] and [4] and even ran the script from [4], as can be
seen below, but I still cannot upload new objects.

I was hoping that if I waited long enough to move from Hammer to Jewel,
most of the big issues would have been solved by point releases, but it seems
that I'm doing something wrong, probably because of a lack of up-to-date
documentation.



After the update to Jewel, this is how things look in my test environment.

root@ceph-mon1:~# radosgw zonegroup get

root@ceph-mon1:~# radosgw-admin period get
period init failed: (2) No such file or directory
2017-01-25 10:13:06.941018 7f98f0d13900  0 RGWPeriod::init failed to init
realm  id  : (2) No such file or directory
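
The empty realm id above is what Jewel complains about on clusters created before Jewel. If it turns out to be relevant, one possible initialization, sketched from the Jewel multi-site migration steps (untested against this cluster, and it rewrites RGW metadata, so lab first):

radosgw-admin realm create --rgw-realm=default --default
radosgw-admin zonegroup modify --rgw-zonegroup=default --master --default
radosgw-admin zone modify --rgw-zone=default --master --default
radosgw-admin period update --commit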

root@ceph-mon1:~# radosgw-admin zonegroup get
failed to init zonegroup: (2) No such file or directory

root@ceph-mon1:~# ceph --version
ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)

root@ceph-mon1:~# radosgw-admin realm list
{
    "default_info": "",
    "realms": []
}

root@ceph-mon1:~# radosgw-admin period list
{
    "periods": []
}

root@ceph-mon1:~# radosgw-admin period get
period init failed: (2) No such file or directory
2017-01-25 12:26:07.217986 7f97ca82e900  0 RGWPeriod::init failed to init
realm  id  : (2) No such file or directory

root@ceph-mon1:~# radosgw-admin zonegroup get --rgw-zonegroup=default
{
    "id": "default",
    "name": "default",
    "api_name": "",
    "is_master": "true",
    "endpoints": [],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "default",
    "zones": [
        {
            "id": "default",
            "name": "default",
            "endpoints": [],
            "log_meta": "false",
            "log_data": "false",
            "bucket_index_max_shards": 0,
            "read_only": "false"
        }
    ],
    "placement_targets": [
        {
            "name": "default-placement",
            "tags": []
        }
    ],
    "default_placement": "default-placement",
    "realm_id": ""
}

root@ceph-mon1:~# radosgw-admin zone get --zone-id=default
{
    "id": "default",
    "name": "default",
    "domain_root": ".rgw",
    "control_pool": ".rgw.control",
    "gc_pool": ".rgw.gc",
    "log_pool": ".log",
    "intent_log_pool": ".intent-log",
    "usage_log_pool": ".usage",
    "user_keys_pool": ".users",
    "user_email_pool": ".users.email",
    "user_swift_pool": ".users.swift",
    "user_uid_pool": ".users.uid",
    "system_key": {
        "access_key": "",
        "secret_key": ""
    },
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": ".rgw.buckets.index",
                "data_pool": ".rgw.buckets",
                "data_extra_pool": ".rgw.buckets.extra",
                "index_type": 0
            }
        }
    ],
    "metadata_heap": ".rgw.meta",
    "realm_id": ""
}
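
For safety, the zonegroup and zone definitions above can be dumped to files and pushed back verbatim if an experiment goes wrong; a sketch, with arbitrary file names:

radosgw-admin zonegroup get --rgw-zonegroup=default > zonegroup-default.json
radosgw-admin zone get --rgw-zone=default > zone-default.json

# and, if needed, to restore them later:
radosgw-admin zonegroup set --rgw-zonegroup=default < zonegroup-default.json
radosgw-admin zone set --rgw-zone=default < zone-default.json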

root@ceph-mon1:~# rados df
pool name              KB  objects  clones  degraded  unfound     rd  rd KB     wr  wr KB
.log                    0      127       0         0        0  41402  41275  41402      0
.rgw                    4       14       0         0        0    147    117     35     14
.rgw.buckets        11635        4       0         0        0      4   4969     38  11637
.rgw.buckets.index      0       56       0         0        0   1871   1815    119      0
.rgw.control            0        8       0         0        0      0      0      0      0
.rgw.gc                 0       32       0         0        0   5214   5182   3519      0
.rgw.meta               2        8       0         0        0      0      0     20      8
.rgw.root               2        4       0         0        0     72     48     12      8
.usage                  0        2       0         0        0     87     87    174      0
.users.uid              1        4       0         0        0    104     96     44      2
rbd                     0        0       0         0        0      0