[ceph-users] Re: Interruption of rebalancing

2023-03-01 Thread Eugen Block

Hi,

if your failure domain is "host" and you have enough redundancy (e.g.
replicated size 3 or proper erasure-code profiles and rulesets), you
should be able to reboot without any issue. Depending on how long the
reboot takes, you could set the noout flag; by default, OSDs are
marked "out" after 10 minutes.
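
For example, a minimal sequence might look like this (a sketch, adjust to taste):

```
# keep OSDs from being marked out while the node is down
ceph osd set noout

# optional: check the down -> out timer (default is 600 seconds)
ceph config get mon mon_osd_down_out_interval

# ... reboot the node and wait for its OSDs to come back up ...

# re-enable normal out-marking and confirm the cluster recovers
ceph osd unset noout
ceph -s
```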


Regards,
Eugen

Quoting Jeffrey Turmelle:

I have a Nautilus cluster with 7 nodes, 210 HDDs.  I recently added  
the 7th node with 30 OSDs which are currently rebalancing very  
slowly.  I just noticed that the ethernet interface only negotiated  
a 1Gb connection, even though it has a 10Gb interface.  I’m not sure  
why, but would like to reboot the node to get the interface back to  
10Gb.


Is it ok to do this?  What should I do to prep the cluster for the reboot?


Jeffrey Turmelle
International Research Institute for Climate & Society
The Climate School at Columbia University
845-652-3461



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How do I troubleshoot radosgw errors STS?

2023-03-01 Thread Pritha Srivastava
I will look into the bug that you submitted.

Thanks,
Pritha

On Thu, Mar 2, 2023 at 3:46 AM  wrote:

> Hello,
>
> I just submitted: https://tracker.ceph.com/issues/58890
>
> Here are more details about the configuration. Note that I've tried the URL
> both with and without a trailing `/`, matching what appears in the `iss` claim.
>
> STS OpenIDConnectProvider
>
> 
> {
>   "ClientIDList": [
> "radosgw"
>   ],
>   "CreateDate": "2023-03-01T04:05:45.93+00:00",
>   "ThumbprintList": [
> "16A1FBBEE0DC3F78C2013326B2EBA2B9F6D59575"
>   ],
>   "Url": "https://login.lab/application/o/d7d64496e26c156ca9ea0802c5d7ed1c
> "
> }
> 
>
> Role document with the ARN used in the AssumeRoleWithWebIdentity call. The
> token returns a "sub" claim with the value of "mathew.utter", i.e. me.
>
> 
> {
> "RoleId": "53186307-cc98-4904-b867-aa6c2fb10291",
> "RoleName": "AssumeRoleWithWebIdentityForOIDC",
> "Path": "/",
> "Arn": "arn:aws:iam:::role/AssumeRoleWithWebIdentityForOIDC",
> "CreateDate": "2023-03-01T04:05:46.417Z",
> "MaxSessionDuration": 3600,
> "AssumeRolePolicyDocument":
> "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Federated\":[\"arn:aws:iam:::oidc-provider/login.lab/application/o/d7d64496e26c156ca9ea0802c5d7ed1c\"]},\"Action\":[\"sts:AssumeRoleWithWebIdentity\"],\"Condition\":{\"StringEquals\":{\"login.lab/application/o/d7d64496e26c156ca9ea0802c5d7ed1c:sub\":\"mathew.utter\"}}}]}"
> }
> 
>
> Policy attached to the role:
>
> 
> {
> "Permission policy":
> "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":[\"s3:*\"],\"Resource\":[\"arn:aws:s3:::*\"]}]}"
> }
> 
>
>
> There would be a role and policy created for each OIDC user, which is why
> I'm using the "sub" in the Role.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RadosGW multipart fragments not being cleaned up by lifecycle policy on Quincy

2023-03-01 Thread Sean Houghton
The latest version of quincy seems to be having problems cleaning up multipart 
fragments from canceled uploads. 

The bucket is empty:

% s3cmd -c .s3cfg ls s3://warp-benchmark
%

However, it's got 11TB of data and 700k objects.

# radosgw-admin bucket stats --bucket=warp-benchmark
{
"bucket": "warp-benchmark",
"num_shards": 10,
"tenant": "",
"zonegroup": "6be863e8-a9f2-42c9-b114-c8651b1f1afa",
"placement_rule": "ssd.ec63",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
},
"id": "aa099b5e-01d5-4394-b287-df99a4d63298.18924.1",
"marker": "aa099b5e-01d5-4394-b287-df99a4d63298.37403.1",
"index_type": "Normal",
"owner": "warp_benchmark",
"ver": 
"0#5580404,1#5593184,2#5586262,3#5591427,4#5591937,5#5588120,6#5589760,7#5582923,8#5579062,9#5578699",
"master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0",
"mtime": "0.00",
"creation_time": "2023-02-10T21:45:12.721604Z",
"max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#",
"usage": {
"rgw.main": {
"size": 12047620866048,
"size_actual": 12047620866048,
"size_utilized": 12047620866048,
"size_kb": 11765254752,
"size_kb_actual": 11765254752,
"size_kb_utilized": 11765254752,
"num_objects": 736113
}
},
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
}

A bucket list shows that they are all multipart fragments

# radosgw-admin bucket list --bucket=warp-benchmark
[
... (LOTS OF THESE)
{
"name": 
"_multipart_(2F3(gCS/1.GagoUCrCRqawswb6.rnd.tg1efLm7-es41Xg3i-Nm6bYjS-c-No79.12",
"instance": "",
"ver": {
"pool": 20,
"epoch": 30984
},
"locator": "",
"exists": "true",
"meta": {
"category": 1,
"size": 16777216,
"mtime": "2023-02-16T00:03:01.586472Z",
"etag": "e7475bca6a58de35648ca5f25d6653bf",
"storage_class": "",
"owner": "warp_benchmark",
"owner_display_name": "Warp Benchmark",
"content_type": "",
"accounted_size": 16777216,
"user_data": "",
"appendable": "false"
},
"tag": "_YdopX7yxnVrvg2h35MIQGN3vsPyZx5W",
"flags": 0,
"pending_map": [],
"versioned_epoch": 0
}
]

Note that the timestamp is from 2 weeks ago, so a lifecycle policy of "cleanup
after 1 day" should delete them.

cat cleanup-multipart.xml
<LifecycleConfiguration>
<Rule>
<ID>abort-multipart-rule</ID>
<Filter>
<Prefix></Prefix>
</Filter>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
  <DaysAfterInitiation>1</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>

% s3cmd dellifecycle s3://warp-benchmark
s3://warp-benchmark/: Lifecycle Policy deleted
% s3cmd setlifecycle cleanup-multipart.xml s3://warp-benchmark
s3://warp-benchmark/: Lifecycle Policy updated
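
The stored policy can be read back to confirm it was saved, e.g.:

```
% s3cmd getlifecycle s3://warp-benchmark
```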

A secondary problem is that the lifecycle policy never runs automatically and 
is stuck in the UNINITIAL state. This problem is for another day of debugging.

# radosgw-admin lc list
[
{
"bucket": 
":warp-benchmark:aa099b5e-01d5-4394-b287-df99a4d63298.37403.1",
"started": "Thu, 01 Jan 1970 00:00:00 GMT",
"status": "UNINITIAL"
}
]
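
One thing that may be related (a guess, not verified here): by default RGW only
runs lifecycle processing inside the rgw_lifecycle_work_time window
("00:00-06:00"), so automatic runs can look stuck. A sketch for widening the
window, or shortening the cycle for testing; adjust the config target to match
your RGW instances and restart them afterwards:

```
# allow lifecycle processing to run around the clock
ceph config set client.rgw rgw_lifecycle_work_time "00:00-23:59"

# testing only: treat a "day" as 600 seconds so rules fire quickly
ceph config set client.rgw rgw_lc_debug_interval 600
```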

However, it can be started manually

# radosgw-admin lc process
# radosgw-admin lc list
[
{
"bucket": 
":warp-benchmark:aa099b5e-01d5-4394-b287-df99a4d63298.37403.1",
"started": "Wed, 01 Mar 2023 17:35:27 GMT",
"status": "COMPLETE"
}
]

This has no effect on the bucket and the bucket stats show the exact same size 
and object count (output omitted for brevity).

Running a gc pass also has no effect

# radosgw-admin gc list 
[]
# radosgw-admin gc process
# radosgw-admin gc list
[]
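
As a stopgap, the stale uploads can also be listed and aborted by hand from the
S3 side; a rough sketch with s3cmd (object key and upload ID are placeholders):

```
# list in-progress multipart uploads in the bucket
% s3cmd multipart s3://warp-benchmark

# abort one of them by object key and upload ID
% s3cmd abortmp s3://warp-benchmark/OBJECT_KEY UPLOAD_ID
```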


Any ideas?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How do I troubleshoot radosgw errors STS?

2023-03-01 Thread mat
Hello,

I just submitted: https://tracker.ceph.com/issues/58890 

Here are more details about the configuration. Note that I've tried the URL both
with and without a trailing `/`, matching what appears in the `iss` claim.

STS OpenIDConnectProvider 


{
  "ClientIDList": [
"radosgw"
  ],
  "CreateDate": "2023-03-01T04:05:45.93+00:00",
  "ThumbprintList": [
"16A1FBBEE0DC3F78C2013326B2EBA2B9F6D59575"
  ],
  "Url": "https://login.lab/application/o/d7d64496e26c156ca9ea0802c5d7ed1c;
}


Role document with the ARN used in the AssumeRoleWithWebIdentity call. The token
returns a "sub" claim with the value of "mathew.utter", i.e. me.


{
"RoleId": "53186307-cc98-4904-b867-aa6c2fb10291",
"RoleName": "AssumeRoleWithWebIdentityForOIDC",
"Path": "/",
"Arn": "arn:aws:iam:::role/AssumeRoleWithWebIdentityForOIDC",
"CreateDate": "2023-03-01T04:05:46.417Z",
"MaxSessionDuration": 3600,
"AssumeRolePolicyDocument": 
"{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Federated\":[\"arn:aws:iam:::oidc-provider/login.lab/application/o/d7d64496e26c156ca9ea0802c5d7ed1c\"]},\"Action\":[\"sts:AssumeRoleWithWebIdentity\"],\"Condition\":{\"StringEquals\":{\"login.lab/application/o/d7d64496e26c156ca9ea0802c5d7ed1c:sub\":\"mathew.utter\"}}}]}"
}


Policy attached to the role:


{
"Permission policy": 
"{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":[\"s3:*\"],\"Resource\":[\"arn:aws:s3:::*\"]}]}"
}



There would be a role and policy created for each OIDC user, which is why I'm
using the "sub" in the Role.
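
For completeness, this is roughly how the role is exercised; a sketch with the
AWS CLI, where $OIDC_TOKEN is assumed to hold the ID token from the provider and
the endpoint is the RGW instance:

```
aws sts assume-role-with-web-identity \
  --endpoint-url https://s3.lab \
  --role-arn "arn:aws:iam:::role/AssumeRoleWithWebIdentityForOIDC" \
  --role-session-name oidc-test \
  --web-identity-token "$OIDC_TOKEN"
```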
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PG Sizing Question

2023-03-01 Thread Anthony D'Atri


> By the sounds of it, a cluster may be configured for the 100 PG / OSD target; 
> adding pools to the former configuration scenario will require an increase in 
> OSDs to maintain that recommended PG distribution target and accommodate an 
> increase of PGs resulting from additional pools.

To be clear, 100 or 200 or any other number is a target, not a precise goal.  
In the past one could increase the number of PGs in a given pool but not 
decrease; with Nautilus we got the ability to decrease.

So if you foresee adding more pools, you either just add them and the effective
ratio increases a bit, or adjust some existing pools down. Adding OSDs has the
effect of reducing the number of PGs per OSD (absent other topology changes at
the same time). So if you think you're going to be adding pools, maybe aim lower
rather than higher.
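
For reference, checking the current ratio and nudging a pool down looks roughly
like this (pool name and target pg_num are examples):

```
# per-OSD PG counts (PGS column) -- eyeball the average
ceph osd df

# current pg_num / pgp_num per pool
ceph osd pool ls detail

# since Nautilus, pg_num can be decreased as well as increased
ceph osd pool set <pool-name> pg_num 1024
```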

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PG Sizing Question

2023-03-01 Thread Deep Dish
Thank you for this perspective Anthony.

I was honestly hoping the autoscaler would work in my case; however, I had
less than desired results with it. On 17.2.5 it actually failed to scale as
advertised: I had a pool created via the web console with 1 PG, then kicked
off a job to migrate data. Understandably, the cluster wasn't optimal with
several tens of terabytes in this pool with 1 PG, so I've been manually
scaling since. I am using 12Gbit/s SAS spinners on variants of 6Gbit/s or
12Gbit/s backplanes. Either way, 4-6Gbit/s of throughput per OSD is designed
in with each OSD node.

Memory (assuming monitors) is something that can be adjusted as well. I did
notice that with many pools (10+) and a total target of 100 PGs / OSD across
the cluster, it's somewhat difficult to attain an even distribution across all
OSDs, leaving some running warmer than others in terms of capacity utilization
and at risk of prematurely filling up.

I was hoping the guidance would be per pool vs. cluster wide for PG / OSD.
If this is indeed the recommended spec, I'll have to rethink the pools we
have and their purpose / utilization.  Looking forward to additional
perspectives, best practices around this.  By the sounds of it, a cluster
may be configured for the 100 PG / OSD target; adding pools to the former
configuration scenario will require an increase in OSDs to maintain that
recommended PG distribution target and accommodate an increase of PGs
resulting from additional pools.
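
For what it's worth, the upmap balancer can help even out per-OSD PG counts and
utilization; a minimal sketch, assuming all clients are Luminous or newer:

```
ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
ceph balancer status

# re-check per-OSD PG counts and utilization afterwards
ceph osd df
```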

On Wed, Mar 1, 2023 at 12:58 AM Anthony D'Atri  wrote:

> This can be subtle and is easy to mix up.
>
> The “PG ratio” is intended to be the number of PGs hosted on each OSD,
> plus or minus a few.
>
> Note how I phrased that, it’s not the number of PGs divided by the number
> of OSDs.  Remember that PGs are replicated.
>
> While each PG belongs to exactly one pool, for purposes of estimating
> pg_num, we calculate the desired aggregate number of PGs on this ratio,
> then divide that up among pools, ideally split into powers of 2 per pool,
> relative to the amount of data in the pool.
>
> You can run `ceph osd df` and see the number of PGs on each OSD.  There
> will be some variance, but consider the average.
>
> This venerable calculator:
>
> PGCalc (old.ceph.com)
>
> can help get a feel for how this works.
>
> 100 is the official party line; it used to be 200.  More PGs means more
> memory use; too few has various other drawbacks.
>
> PGs can in part be thought of as parallelism domains; more PGs means more
> parallelism.  So on HDDs, a ratio in the 100-200 range is IMHO reasonable.
> SAS/SATA OSDs 200-300, NVMe OSDs perhaps higher, though perhaps not if each
> device hosts more than one OSD (which should only ever be done on NVMe
> devices).
>
> Your numbers below are probably ok for HDDs, you might bump the pool with
> the most data up to the next power of 2 if these are SSDs.
>
> The pgcalc above includes parameters for what fraction of the cluster’s
> data each pool contains.  A pool with 5% of the data needs fewer PGs than a
> pool with 50% of the cluster’s data.
>
> Others may well have different perspectives, this is something where
> opinions vary.  The pg_autoscaler in bulk mode can automate this, if one is
> prescient with feeding it parameters.
>
>
>
> On Feb 28, 2023, at 9:23 PM, Deep Dish  wrote:
>
> Hello
>
>
>
> Looking to get some official guidance on PG and PGP sizing.
>
>
>
> Is the goal to maintain approximately 100 PGs per OSD per pool, or for the
> cluster in general?
>
>
>
> Assume the following scenario:
>
>
>
> Cluster with 80 OSD across 8 nodes;
>
> 3 Pools:
>
> -   Pool1 = Replicated 3x
>
> -   Pool2 = Replicated 3x
>
> -   Pool3 = Erasure Coded 6-4
>
>
>
>
>
> Assuming the well published formula:
>
>
>
> Let (Target PGs / OSD) = 100
>
>
>
> [ (Target PGs / OSD) * (# of OSDs) ] / (Replica Size)
>
>
>
> -   Pool1 = (100*80)/3 = 2666.67 => 4096
>
> -   Pool2 = (100*80)/3 = 2666.67 => 4096
>
> -   Pool3 = (100*80)/10 = 800 => 1024
>
>
>
> Total cluster would have 9216 PGs and PGPs.
>
>
> Are there any implications (performance / monitor / MDS / RGW sizing) with
> how many PGs are created on the cluster?
>
>
>
> Looking for validation and / or clarification of the above.
>
>
>
> Thank you.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Interruption of rebalancing

2023-03-01 Thread Jeffrey Turmelle
I have a Nautilus cluster with 7 nodes, 210 HDDs.  I recently added the 7th 
node with 30 OSDs which are currently rebalancing very slowly.  I just noticed 
that the ethernet interface only negotiated a 1Gb connection, even though it 
has a 10Gb interface.  I’m not sure why, but would like to reboot the node to 
get the interface back to 10Gb.

Is it ok to do this?  What should I do to prep the cluster for the reboot?


Jeffrey Turmelle
International Research Institute for Climate & Society
The Climate School at Columbia University
845-652-3461



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How do I troubleshoot radosgw errors STS?

2023-03-01 Thread hazmat
Also, here is the ceph version: ceph version 17.2.5 
(e04241aa9b639588fa6c864845287d2824cb6b55) quincy (stable)


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How do I troubleshoot radosgw errors STS?

2023-03-01 Thread Pritha Srivastava
Hi,

What version of ceph are you using? Can you share the trust policy that is
attached to the role being assumed?

Thanks,
Pritha

On Wed, Mar 1, 2023 at 9:07 PM  wrote:

> I've set up RadosGW with STS on top of my Ceph cluster. It works fine, but I'm
> also trying to set up authentication with an OpenID Connect provider. I'm having
> a hard time troubleshooting issues because the radosgw log file doesn't have much
> information in it. For example, when I try to use the
> `sts:AssumeRoleWithWebIdentity` API it fails with `{'Code': 'AccessDenied', ...}`
> and all I see is the beast log showing an HTTP 403.
>
> Is there a way to enable more verbose logging so I can see what is failing
> and why I'm getting certain errors with the STS, S3, or IAM APIs?
>
> My ceph.conf looks like this for each node (mildly redacted):
>
> ```
> [client.radosgw.pve4]
> host = pve4
> keyring = /etc/pve/priv/ceph.client.radosgw.keyring
> log file = /var/log/ceph/client.radosgw.$host.log
> rgw_dns_name = s3.lab
> rgw_frontends = beast endpoint=0.0.0.0:7480 ssl_endpoint=0.0.0.0:443
> ssl_certificate=/etc/pve/priv/ceph/s3.lab.crt
> ssl_private_key=/etc/pve/priv/ceph/s3.lab.key
> rgw_sts_key = 
> rgw_s3_auth_use_sts = true
> rgw_enable_apis = s3, s3website, admin, sts, iam
> ```
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How do I troubleshoot radosgw errors STS?

2023-03-01 Thread mat
I've set up RadosGW with STS on top of my Ceph cluster. It works fine, but I'm also
trying to set up authentication with an OpenID Connect provider. I'm having a hard
time troubleshooting issues because the radosgw log file doesn't have much
information in it. For example, when I try to use the `sts:AssumeRoleWithWebIdentity`
API it fails with `{'Code': 'AccessDenied', ...}` and all I see is the beast log
showing an HTTP 403.

Is there a way to enable more verbose logging so I can see what is failing and
why I'm getting certain errors with the STS, S3, or IAM APIs?

My ceph.conf looks like this for each node (mildly redacted):

```
[client.radosgw.pve4]
host = pve4
keyring = /etc/pve/priv/ceph.client.radosgw.keyring
log file = /var/log/ceph/client.radosgw.$host.log
rgw_dns_name = s3.lab
rgw_frontends = beast endpoint=0.0.0.0:7480 ssl_endpoint=0.0.0.0:443 
ssl_certificate=/etc/pve/priv/ceph/s3.lab.crt 
ssl_private_key=/etc/pve/priv/ceph/s3.lab.key
rgw_sts_key = 
rgw_s3_auth_use_sts = true
rgw_enable_apis = s3, s3website, admin, sts, iam
```
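
(For anyone else hitting this: the usual way to get more detail out of radosgw is
to raise its debug level, e.g. by adding the lines below to the same ceph.conf
section and restarting the gateway. `debug rgw = 20` is very verbose, so turn it
back down afterwards.)

```
[client.radosgw.pve4]
debug rgw = 20
debug ms = 1
```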
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Next quincy release (17.2.6)

2023-03-01 Thread Yuri Weinstein
As sepia stabilizes, we want to plan for this release ASAP.
If you still have outstanding PRs, please add "needs-qa" for the 'quincy'
milestone and make sure that they are approved and passing checks.

Thx

On Fri, Feb 17, 2023 at 8:24 AM Yuri Weinstein  wrote:
>
> Hello
>
> We are planning to start QE validation release next week.
> If you have PRs that are to be part of it, please let us know by
> adding "needs-qa" for 'quincy' milestone ASAP.
>
> Thx
> YuriW
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 compatible interface

2023-03-01 Thread Daniel Gryniewicz
We're actually writing this for RGW right now.  It'll be a bit before 
it's productized, but it's in the works.


Daniel

On 2/28/23 14:13, Fox, Kevin M wrote:

Minio no longer lets you read / write from the posix side. Only through minio 
itself. :(

Haven't found a replacement yet. If you do, please let me know.

Thanks,
Kevin


From: Robert Sander 
Sent: Tuesday, February 28, 2023 9:37 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: s3 compatible interface


On 28.02.23 16:31, Marc wrote:


Anyone know of an S3-compatible interface that I can just run, which reads/writes
files from a local file system and not from object storage?


Have a look at Minio:

https://min.io/product/overview#architecture

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de/

Tel: 030-405051-43
Fax: 030-405051-19

Mandatory information per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing Director: Peer Heinlein -- Registered office: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph 16.2.10 - misplaced object after changing crush map only setting hdd class

2023-03-01 Thread Eugen Block

Hi,

do you know if your crush tree already had the "shadow" tree (probably  
not)? If there wasn't a shadow-tree ("default~hdd") then the remapping  
is expected. What exact version did you install this cluster with?


storage01:~ # ceph osd crush tree --show-shadow
ID  CLASS  WEIGHT   TYPE NAME
-2hdd  0.05699  root default~hdd
-4hdd  0.05699  host storage01~hdd
 0hdd  0.01900  osd.0
 1hdd  0.01900  osd.1
 2hdd  0.01900  osd.2
-1 0.05800  root default
-3 0.05800  host storage01
 0hdd  0.01900  osd.0
 1hdd  0.01900  osd.1
 2hdd  0.01900  osd.2
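
One way to check in advance whether a crushmap edit will move data is to compare
the computed mappings offline with crushtool, roughly like this (rule ID and
replica count are examples):

```
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# edit crush.txt, then recompile and compare the test mappings
crushtool -c crush.txt -o crush-new.bin
crushtool -i crush.bin     --test --show-mappings --rule 2 --num-rep 3 > before.txt
crushtool -i crush-new.bin --test --show-mappings --rule 2 --num-rep 3 > after.txt
diff before.txt after.txt
```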


Quoting xadhoo...@gmail.com:


Hi to all, and thanks for sharing your experience on ceph!
We have a simple setup with 9 OSDs, all HDD, across 3 nodes (3 OSDs per node).
We started the cluster with a default, easy bootstrap to test how it works
with HDDs. Then we decided to add SSDs and create a pool that uses only SSDs.
In order to have pools on HDD only and pools on SSD only, we edited the
crushmap to add the hdd class.
We haven't added anything about SSD yet, neither disks nor rules; we only
added the class to the default rules.

So here are the rules before introducing the hdd class:
# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
rule erasure-code {
id 1
type erasure
min_size 3
max_size 4
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default
step chooseleaf indep 0 type host
step emit
}
rule erasure2_1 {
id 2
type erasure
min_size 3
max_size 3
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default
step chooseleaf indep 0 type host
step emit
}
rule erasure-pool.meta {
id 3
type erasure
min_size 3
max_size 3
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default
step chooseleaf indep 0 type host
step emit
}
rule erasure-pool.data {
id 4
type erasure
min_size 3
max_size 3
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default
step chooseleaf indep 0 type host
step emit
}

And here they are after the change:

# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default class hdd
step chooseleaf firstn 0 type host
step emit
}
rule erasure-code {
id 1
type erasure
min_size 3
max_size 4
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class hdd
step chooseleaf indep 0 type host
step emit
}
rule erasure2_1 {
id 2
type erasure
min_size 3
max_size 3
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class hdd
step chooseleaf indep 0 type host
step emit
}
rule erasure-pool.meta {
id 3
type erasure
min_size 3
max_size 3
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class hdd
step chooseleaf indep 0 type host
step emit
}
rule erasure-pool.data {
id 4
type erasure
min_size 3
max_size 3
step set_chooseleaf_tries 5
step set_choose_tries 100
step take default class hdd
step chooseleaf indep 0 type host
step emit
}
Just doing this triggered misplaced objects for all the PGs bound to the EC pools.

Is that correct? And why?
Best regards
Alessandro Bolgia
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io