[ceph-users] Re: [object gateway] setting storage class does not move object to correct backing pool?

2019-12-10 Thread Matt Benjamin
Hi Gerdriaan,

I think actually moving an already-stored object requires a lifecycle
transition policy.  Assuming such a policy exists and matches the
object by prefix/tag/time, it would migrate during an (hopefully the
first) eligible lc processing window.
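
(For illustration, a minimal lifecycle rule requesting such a transition
could look like the sketch below; the rule id, day count and lifecycle.xml
filename are placeholders, and this exact XML was not tested here:)

  <LifecycleConfiguration>
    <Rule>
      <ID>move-to-slow</ID>
      <Filter><Prefix></Prefix></Filter>
      <Status>Enabled</Status>
      <Transition>
        <Days>1</Days>
        <StorageClass>SPINNING_RUST</StorageClass>
      </Transition>
    </Rule>
  </LifecycleConfiguration>

  # apply with s3cmd, then wait for (or manually trigger) an LC processing run
  $ s3cmd setlifecycle lifecycle.xml s3://bucket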

Matt

On Tue, Dec 10, 2019 at 7:44 AM Gerdriaan Mulder  wrote:
>
> Hi,
>
> If I change the storage class of an object via s3cmd, the object's
> storage class is reported as being changed. However, when inspecting
> where the objects are placed (via `rados -p  ls`, see further on),
> the object seems to be retained in the original pool.
>
> The idea behind this test setup is to simulate two storage locations,
> one based on SSDs or similar flash storage, the other on slow HDDs. We
> want to be able to alter the storage location of objects on the fly,
> typically only from fast to slow storage. The object should then only
> reside on slow storage.
>
> The setup is as follows on Nautilus (Ubuntu 16.04, see
>  for the
> full dump):
>
> 
> root@node1:~# ceph -s
>   health: HEALTH_OK
>
>   mon: 3 daemons, quorum node1,node3,node5 (age 12d)
>   mgr: node2(active, since 6d), standbys: node4
>   osd: 4 osds: 4 up (since 12d), 4 in (since 12d)
>   rgw: 1 daemon active (node1)
>
>   pools:   7 pools, 296 pgs
>   objects: 229 objects, 192 KiB
>   usage:   3.2 GiB used, 6.8 GiB / 10 GiB avail
>   pgs: 296 active+clean
>
> root@node1:~# ceph osd tree
> ID  CLASS WEIGHT  TYPE NAME   STATUS REWEIGHT PRI-AFF
>   -1   0.00970 root default
> -16   0.00970 datacenter nijmegen
>   -3   0.00388 host node2
>0   hdd 0.00388 osd.0   up  1.0 1.0
>   -5   0.00388 host node3
>1   hdd 0.00388 osd.1   up  1.0 1.0
>   -7   0.00098 host node4
>2   ssd 0.00098 osd.2   up  1.0 1.0
>   -9   0.00098 host node5
>3   ssd 0.00098 osd.3   up  1.0 1.0
>
> root@node1:~# ceph osd pool ls detail
> pool 1 'tier1-ssd' replicated size 2 min_size 1 crush_rule 1 object_hash
> rjenkins pg_num 128 pgp_num 128 [snip] application rgw
> pool 2 'tier2-hdd' replicated size 1 min_size 1 crush_rule 2 object_hash
> rjenkins pg_num 128 pgp_num 128 [snip] application rgw
> pool 3 '.rgw.root' replicated size 2 min_size 1 crush_rule 0 object_hash
> rjenkins pg_num 8 pgp_num 8 [snip] application rgw
> pool 4 'default.rgw.control' replicated size 2 min_size 1 crush_rule 0
> [snip] application rgw
> pool 5 'default.rgw.meta' replicated size 2 min_size 1 crush_rule 0
> [snip] application rgw
> pool 6 'default.rgw.log' replicated size 2 min_size 1 crush_rule 0
> [snip] application rgw
> pool 7 'default.rgw.buckets.index' replicated size 3 min_size 2
> crush_rule 0 [snip] application rgw
>
> root@node1:~# ceph osd pool application get    # compacted
>tier1-ssd => rgw {}
>tier2-hdd => rgw {}
>.rgw.root => rgw {}
> default.rgw.control => rgw {}
> default.rgw.meta => rgw {}
> default.rgw.log => rgw {}
> default.rgw.buckets.index => rgw {}
>
> root@node1:~# radosgw-admin zonegroup placement list
> [
>  {
>  "key": "default-placement",
>  "val": {
>  "name": "default-placement",
>  "tags": [],
>  "storage_classes": [
>  "SPINNING_RUST",
>  "STANDARD"
>  ]
>  }
>  }
> ]
>
> root@node1:~# radosgw-admin zone placement list
> [
>  {
>  "key": "default-placement",
>  "val": {
>  "index_pool": "default.rgw.buckets.index",
>  "storage_classes": {
>  "SPINNING_RUST": {
>  "data_pool": "tier2-hdd"
>  },
>  "STANDARD": {
>  "data_pool": "tier1-ssd"
>  }
>  },
>  "data_extra_pool": "default.rgw.buckets.non-ec",
>  "index_type": 0
>  }
>  }
> ]
> 
>
> I can also post the relevant s3cmd commands for putting objects and
> setting the storage class, but perhaps this is already enough
> information. Please let me know.
>
> 
> root@node1:~# rados -p tier1-ssd ls
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_darthvader.png
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_2019-10-15-090436_1254x522_scrubbed.png
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_kanariepiet.jpg
>
> root@node1:~# rados -p tier2-hdd ls
> ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1__shadow_.FEruUOZaVJXJcOG-e2tO1xcInNzoEvN_0
>
> $ s3cmd info s3://bucket/kanariepiet.jpg
> [snip]
> Last mod:  Tue, 10 Dec 2019 08:09:58 GMT
> Storage:   STANDARD
> [snip]
>
> $ s3cmd info s3://bucket/darthvader.png
> [snip]
> Last mod:  Wed, 04 Dec 2019 10:35:14 GMT
> Storage:   SPINNING_RUST
> [snip]
>
> $ s3cmd info s3://bucket/2019-10-15-090436_1254x522_scrubb

[ceph-users] Re: [object gateway] setting storage class does not move object to correct backing pool?

2019-12-10 Thread Gerdriaan Mulder

Hi Matt,

On 12/10/19 1:52 PM, Matt Benjamin wrote:

I think actually moving an already-stored object requires a lifecycle
transition policy.  Assuming such a policy exists and matches the
object by prefix/tag/time, it would migrate during an (hopefully the
first) eligible lc processing window.


That would probably be an acceptable alternative. I did have such a 
policy in place to automatically change the storage class (and I 
verified that this works from the perspective of s3cmd). But when I 
looked at `rados -p tier1-ssd ls`, the object that I presumed had been 
moved was still there.
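
(For reference, the lifecycle side can be inspected roughly as follows; 
this is only a sketch and was not re-run on this cluster:)

<<<
client $ s3cmd getlifecycle s3://bucket      # shows the transition rule, if any
root@node1:~# radosgw-admin lc list          # per-bucket lifecycle processing status
root@node1:~# radosgw-admin lc process       # trigger a lifecycle run by hand
===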


Another example I just executed: directly setting the storage class 
gives me the following output:


<<<
client $ s3cmd put git-tree.png s3://bucket/ --storage-class=SPINNING_RUST
WARNING: Module python-magic is not available. Guessing MIME types based 
on file extensions.

upload: 'git-tree.png' -> 's3://bucket/git-tree.png'  [1 of 1]
 16490 of 16490   100% in    0s   532.37 kB/s  done

client $ s3cmd info s3://bucket/git-tree.png
   Last mod:  Tue, 10 Dec 2019 12:58:52 GMT
   Storage:   SPINNING_RUST
===

On the cluster:

<<<
root@node1:~# rados -p tier2-hdd ls
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1__shadow_.aQHkk2RcTCHN64E_XJjA0wGTnUtWSN2_0
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1__shadow_.FEruUOZaVJXJcOG-e2tO1xcInNzoEvN_0
root@node1:~# rados -p tier1-ssd ls
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_git-tree.png
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_darthvader.png
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_2019-10-15-090436_1254x522_scrubbed.png
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_kanariepiet.jpg
===

whereas I would expect "git-tree.png" to reside only in the pool 
tier2-hdd. This suggests that I made an error in configuring the 
storage_class->pool association.


My hunch is that the zone(group) placement is incorrect, but I can't 
really find clear documentation on that subject.
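
(For reference, a storage class and its data pool are normally tied 
together with commands along these lines; this is a sketch based on the 
generic radosgw-admin placement commands and was not re-run here:)

<<<
root@node1:~# radosgw-admin zonegroup placement add \
    --rgw-zonegroup default --placement-id default-placement \
    --storage-class SPINNING_RUST
root@node1:~# radosgw-admin zone placement add \
    --rgw-zone default --placement-id default-placement \
    --storage-class SPINNING_RUST --data-pool tier2-hdd
# only needed when a realm/period (multisite) is configured:
root@node1:~# radosgw-admin period update --commit
===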


Any thoughts on that?

Best regards,
Gerdriaan Mulder



[ceph-users] Re: [object gateway] setting storage class does not move object to correct backing pool?

2019-12-10 Thread Casey Bodley



On 12/10/19 8:10 AM, Gerdriaan Mulder wrote:

whereas I would expect "git-tree.png" to reside only in the pool 
tier2-hdd. This suggests that I made an error in configuring the 
storage_class->pool association.


git-tree.png is the 'head' object, which stores the object's attributes 
(including storage class). Head objects always go in the default storage 
class of the bucket's placement target, because we have to be able to 
find them without knowing their storage class. The actual object data 
gets striped over 'tail' objects, which have funky names with _shadow_ 
in them, and it's these objects that you see placed correctly in the 
tier2-hdd pool.
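
(A quick way to confirm this, as a sketch using the object names from the 
listings above; the head object name comes from the bucket marker prefix 
shown there:)

root@node1:~# rados -p tier1-ssd stat \
    ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_git-tree.png
# expect a small (or zero) size: the head holds attributes, not the data

root@node1:~# radosgw-admin object stat --bucket=bucket --object=git-tree.png
# the manifest in the JSON output should show the tail placement, i.e. the
# SPINNING_RUST storage class backed by tier2-hdd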





[ceph-users] Re: [object gateway] setting storage class does not move object to correct backing pool?

2019-12-10 Thread Gerdriaan Mulder

Hi Casey,

On 12/10/19 3:00 PM, Casey Bodley wrote:




git-tree.png is the 'head' object, which stores the object's attributes 
(including storage class). Head objects always go in the default storage 
class of the bucket's placement target, because we have to be able to 
find them without knowing their storage class. The actual object data 
gets striped over 'tail' objects, which have funky names with _shadow_ 
in them, and it's these objects that you see placed correctly in the 
tier2-hdd pool.


Thanks for the explanation. It seems the documentation on Nautilus is 
somewhat lacking on these particular intricacies.


Best regards,
Gerdriaan Mulder
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io