[ceph-users] Re: Some hint for a DELL PowerEdge T440/PERC H750 Controller...

2023-04-06 Thread Matthias Ferdinand
Spinners are slow anyway, but on top of that SAS disks often default to
writecache=off. When a disk is used on its own, with no RAID write-hole to
worry about, you can turn the write cache on. On SAS, I would assume the
firmware does not lie about writes having reached stable storage (flushes).

# turn on temporarily:
sdparm --set=WCE /dev/sdX

# turn on persistently:
sdparm --set=WCE --save /dev/sdX


To check current state:

sdparm --get=WCE /dev/sdf
/dev/sdf: SEAGATE   ST2000NM0045  DS03
WCE 0  [cha: y, def:  0, sav:  0]

"WCE 0" means: off
"sav: 0" means: off next time the disk is powered on


Matthias


On Thu, Apr 06, 2023 at 09:26:27AM -0400, Anthony D'Atri wrote:
> How bizarre, I haven’t dealt with this specific SKU before.  Some Dell / LSI 
> HBAs call this passthrough mode, some “personality”, some “jbod mode”, dunno 
> why they can’t be consistent.
> 
> 
> > We are testing an experimental Ceph cluster with the server and controller named
> > in the subject.
> > 
> > The controller does not have an HBA mode, only a 'NonRAID' mode, some sort of
> > 'auto RAID0' configuration.
> 
> Dell’s CLI guide describes setting individual drives in Non-RAID, which 
> *smells* like passthrough, not the more-complex RAID0 workaround we had to do 
> before passthrough.
> 
> https://www.dell.com/support/manuals/en-nz/perc-h750-sas/perc_cli_rg/set-drive-state-commands?guid=guid-d4750845-1f57-434c-b4a9-935876ee1a8e=en-us
> > 
> > We are using SATA SSD disks (MICRON MTFDDAK480TDT) that perform very well,
> > and SAS HDD disks (SEAGATE ST8000NM014A) that instead perform very badly
> > (particularly, very low IOPS).
> 
> Spinners are slow, this is news?
> 
> That said, how slow is slow?  Testing commands and results or it didn’t 
> happen.
> 
> Also, firmware matters.  Run Dell’s DSU.
> 
> > Is there some hint for disk/controller configuration/optimization?
> 
> Give us details, perccli /c0 show, test results etc.  
> 
> Use a different HBA if you have to use an HBA, one that doesn’t suffer an 
> RoC.  Better yet, take an expansive look at TCO and don’t write off NVMe as 
> infeasible.  If your cluster is experimental hopefully you aren’t stuck with 
> a lot of these.  Add up the cost of an RoC HBA, optionally with cache RAM and 
> BBU/supercap, add in the cost delta for SAS HDDs over SATA.  Add in the 
> operational hassle of managing WAL+DB on those boot SSDs.  Add in the extra 
> HDDs you’ll need to provision because of IOPS. 
> 
> > 
> > 
> > Thanks.
> > 
> > -- 
> >  I believe in chemistry as much as Julius Caesar believed in chance...
> >  and I'm fine with it as long as it doesn't concern me :)   (Emanuele Pucciarelli)
> > 


[ceph-users] Re: Some hint for a DELL PowerEdge T440/PERC H750 Controller...

2023-04-06 Thread Anthony D'Atri
How bizarre, I haven’t dealt with this specific SKU before.  Some Dell / LSI 
HBAs call this passthrough mode, some “personality”, some “jbod mode”, dunno 
why they can’t be consistent.


> We are testing an experimental Ceph cluster with the server and controller named
> in the subject.
> 
> The controller does not have an HBA mode, only a 'NonRAID' mode, some sort of
> 'auto RAID0' configuration.

Dell’s CLI guide describes setting individual drives in Non-RAID, which 
*smells* like passthrough, not the more-complex RAID0 workaround we had to do 
before passthrough.

https://www.dell.com/support/manuals/en-nz/perc-h750-sas/perc_cli_rg/set-drive-state-commands?guid=guid-d4750845-1f57-434c-b4a9-935876ee1a8e=en-us
> 
> We are using SATA SSD disks (MICRON MTFDDAK480TDT) that perform very well,
> and SAS HDD disks (SEAGATE ST8000NM014A) that instead perform very badly
> (particularly, very low IOPS).

Spinners are slow, this is news?

That said, how slow is slow?  Testing commands and results or it didn’t happen.
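
For example, something along these lines (just a sketch; /dev/sdX is a
placeholder for an otherwise-idle test drive, and writing to the raw device
destroys any data on it):

fio --name=randwrite-iops --filename=/dev/sdX --direct=1 --sync=1 \
    --rw=randwrite --bs=4k --numjobs=1 --iodepth=1 \
    --runtime=60 --time_based --ioengine=libaio --group_reporting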

Also, firmware matters.  Run Dell’s DSU.

> Is there some hint for disk/controller configuration/optimization?

Give us details, perccli /c0 show, test results etc.  

Use a different HBA if you have to use an HBA, one that doesn’t suffer an RoC.  
Better yet, take an expansive look at TCO and don’t write off NVMe as 
infeasible.  If your cluster is experimental hopefully you aren’t stuck with a 
lot of these.  Add up the cost of an RoC HBA, optionally with cache RAM and 
BBU/supercap, add in the cost delta for SAS HDDs over SATA.  Add in the 
operational hassle of managing WAL+DB on those boot SSDs.  Add in the extra 
HDDs you’ll need to provision because of IOPS. 

> 
> 
> Thanks.
> 
> -- 
>  I believe in chemistry as much as Julius Caesar believed in chance...
>  and I'm fine with it as long as it doesn't concern me :)  (Emanuele Pucciarelli)
> 


[ceph-users] Re: Upgrading from Pacific to Quincy fails with "Unexpected error"

2023-04-06 Thread Adam King
Does "ceph health detail" give any insight into what the unexpected
exception was? If not, I'm pretty confident some traceback would end up
being logged. Could maybe still grab it with "ceph log last 200 info
cephadm" if not a lot else has happened. Also, probably need to find out if
the check-host is failing due to the check on the host actually failing or
failing to connect to the host. Could try putting a copy of the cephadm
binary on one of the hosts and running "cephadm check-host --expect-hostname <hostname>"
where the hostname is the name cephadm knows the host by. If that's not an
issue I'd expect it's a connection thing. Could maybe try going through
https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#ssh-errors.
Cephadm changed the backend ssh library from pacific to quincy due to the
one used in pacific no longer being supported so it's possible some general
ssh error has popped up in your env as a result.
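
In case it helps, a rough sketch of those checks (hostnames and paths are
placeholders; the ssh part follows the troubleshooting doc linked above):

# look for the traceback from the failed upgrade
ceph health detail
ceph log last 200 info cephadm

# on one of the "offline" hosts, with a copy of the cephadm binary present
cephadm check-host --expect-hostname <hostname-as-known-by-cephadm>

# verify the mgr's ssh key/config still work against a failing host
ceph cephadm get-ssh-config > /tmp/cephadm-ssh-config
ceph config-key get mgr/cephadm/ssh_identity_key > /tmp/cephadm-key
chmod 0600 /tmp/cephadm-key
ssh -F /tmp/cephadm-ssh-config -i /tmp/cephadm-key root@<failing-host>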

On Thu, Apr 6, 2023 at 8:38 AM Reza Bakhshayeshi wrote:

> Hi all,
>
> I have a problem upgrading a Ceph cluster from Pacific to Quincy with
> cephadm. I have successfully upgraded the cluster to the latest Pacific
> (16.2.11). But when I run the following command to upgrade the cluster to
> 17.2.5, then after upgrading 3/4 mgrs the upgrade process stops with
> "Unexpected error". (Everything is on a private network.)
>
> ceph orch upgrade start my-private-repo/quay-io/ceph/ceph:v17.2.5
>
> I also tried the 17.2.4 version.
>
> cephadm fails to check the hosts' status and marks them as offline:
>
> cephadm 2023-04-06T10:19:59.998510+ mgr.host9.arhpnd (mgr.4516356) 5782
> : cephadm [DBG]  host host4 (x.x.x.x) failed check
> cephadm 2023-04-06T10:19:59.998553+ mgr.host9.arhpnd (mgr.4516356) 5783
> : cephadm [DBG] Host "host4" marked as offline. Skipping daemon refresh
> cephadm 2023-04-06T10:19:59.998581+ mgr.host9.arhpnd (mgr.4516356) 5784
> : cephadm [DBG] Host "host4" marked as offline. Skipping gather facts
> refresh
> cephadm 2023-04-06T10:19:59.998609+ mgr.host9.arhpnd (mgr.4516356) 5785
> : cephadm [DBG] Host "host4" marked as offline. Skipping network refresh
> cephadm 2023-04-06T10:19:59.998633+ mgr.host9.arhpnd (mgr.4516356) 5786
> : cephadm [DBG] Host "host4" marked as offline. Skipping device refresh
> cephadm 2023-04-06T10:19:59.998659+ mgr.host9.arhpnd (mgr.4516356) 5787
> : cephadm [DBG] Host "host4" marked as offline. Skipping osdspec preview
> refresh
> cephadm 2023-04-06T10:19:59.998682+ mgr.host9.arhpnd (mgr.4516356) 5788
> : cephadm [DBG] Host "host4" marked as offline. Skipping autotune
> cluster 2023-04-06T10:20:00.000151+ mon.host8 (mon.0) 158587 : cluster
> [ERR] Health detail: HEALTH_ERR 9 hosts fail cephadm check; Upgrade: failed
> due to an unexpected exception
> cluster 2023-04-06T10:20:00.000191+ mon.host8 (mon.0) 158588 : cluster
> [ERR] [WRN] CEPHADM_HOST_CHECK_FAILED: 9 hosts fail cephadm check
> cluster 2023-04-06T10:20:00.000202+ mon.host8 (mon.0) 158589 : cluster
> [ERR] host host7 (x.x.x.x) failed check: Unable to reach remote host
> host7. Process exited with non-zero exit status 3
> cluster 2023-04-06T10:20:00.000213+ mon.host8 (mon.0) 158590 : cluster
> [ERR] host host2 (x.x.x.x) failed check: Unable to reach remote host
> host2. Process exited with non-zero exit status 3
> cluster 2023-04-06T10:20:00.000220+ mon.host8 (mon.0) 158591 : cluster
> [ERR] host host8 (x.x.x.x) failed check: Unable to reach remote host
> host8. Process exited with non-zero exit status 3
> cluster 2023-04-06T10:20:00.000228+ mon.host8 (mon.0) 158592 : cluster
> [ERR] host host4 (x.x.x.x) failed check: Unable to reach remote host
> host4. Process exited with non-zero exit status 3
> cluster 2023-04-06T10:20:00.000240+ mon.host8 (mon.0) 158593 : cluster
> [ERR] host host3 (x.x.x.x) failed check: Unable to reach remote host
> host3. Process exited with non-zero exit status 3
>
> and here are some outputs of the commands:
>
> [root@host8 ~]# ceph -s
>   cluster:
> id: xxx
> health: HEALTH_ERR
> 9 hosts fail cephadm check
> Upgrade: failed due to an unexpected exception
>
>   services:
> mon: 5 daemons, quorum host8,host1,host7,host2,host9 (age 2w)
> mgr: host9.arhpnd(active, since 105m), standbys: host8.jowfih,
> host1.warjsr, host2.qyavjj
> mds: 1/1 daemons up, 3 standby
> osd: 37 osds: 37 up (since 8h), 37 in (since 3w)
>
>   data:
>
>
>   io:
> client:
>
>   progress:
> Upgrade to 17.2.5 (0s)
>   []
>
> [root@host8 ~]# ceph orch upgrade status
> {
> "target_image": "my-private-repo/quay-io/ceph/ceph@sha256
> :34c763383e3323c6bb35f3f2229af9f466518d9db926111277f5e27ed543c427",
> "in_progress": true,
> "which": "Upgrading all daemon types on all hosts",
> "services_complete": [],
> "progress": "3/59 daemons upgraded",
> "message": 

[ceph-users] Upgrading from Pacific to Quincy fails with "Unexpected error"

2023-04-06 Thread Reza Bakhshayeshi
Hi all,

I have a problem upgrading a Ceph cluster from Pacific to Quincy with
cephadm. I have successfully upgraded the cluster to the latest Pacific
(16.2.11). But when I run the following command to upgrade the cluster to
17.2.5, then after upgrading 3/4 mgrs the upgrade process stops with
"Unexpected error". (Everything is on a private network.)

ceph orch upgrade start my-private-repo/quay-io/ceph/ceph:v17.2.5

I also tried the 17.2.4 version.

cephadm fails to check the hosts' status and marks them as offline:

cephadm 2023-04-06T10:19:59.998510+ mgr.host9.arhpnd (mgr.4516356) 5782
: cephadm [DBG]  host host4 (x.x.x.x) failed check
cephadm 2023-04-06T10:19:59.998553+ mgr.host9.arhpnd (mgr.4516356) 5783
: cephadm [DBG] Host "host4" marked as offline. Skipping daemon refresh
cephadm 2023-04-06T10:19:59.998581+ mgr.host9.arhpnd (mgr.4516356) 5784
: cephadm [DBG] Host "host4" marked as offline. Skipping gather facts
refresh
cephadm 2023-04-06T10:19:59.998609+ mgr.host9.arhpnd (mgr.4516356) 5785
: cephadm [DBG] Host "host4" marked as offline. Skipping network refresh
cephadm 2023-04-06T10:19:59.998633+ mgr.host9.arhpnd (mgr.4516356) 5786
: cephadm [DBG] Host "host4" marked as offline. Skipping device refresh
cephadm 2023-04-06T10:19:59.998659+ mgr.host9.arhpnd (mgr.4516356) 5787
: cephadm [DBG] Host "host4" marked as offline. Skipping osdspec preview
refresh
cephadm 2023-04-06T10:19:59.998682+ mgr.host9.arhpnd (mgr.4516356) 5788
: cephadm [DBG] Host "host4" marked as offline. Skipping autotune
cluster 2023-04-06T10:20:00.000151+ mon.host8 (mon.0) 158587 : cluster
[ERR] Health detail: HEALTH_ERR 9 hosts fail cephadm check; Upgrade: failed
due to an unexpected exception
cluster 2023-04-06T10:20:00.000191+ mon.host8 (mon.0) 158588 : cluster
[ERR] [WRN] CEPHADM_HOST_CHECK_FAILED: 9 hosts fail cephadm check
cluster 2023-04-06T10:20:00.000202+ mon.host8 (mon.0) 158589 : cluster
[ERR] host host7 (x.x.x.x) failed check: Unable to reach remote host
host7. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000213+ mon.host8 (mon.0) 158590 : cluster
[ERR] host host2 (x.x.x.x) failed check: Unable to reach remote host
host2. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000220+ mon.host8 (mon.0) 158591 : cluster
[ERR] host host8 (x.x.x.x) failed check: Unable to reach remote host
host8. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000228+ mon.host8 (mon.0) 158592 : cluster
[ERR] host host4 (x.x.x.x) failed check: Unable to reach remote host
host4. Process exited with non-zero exit status 3
cluster 2023-04-06T10:20:00.000240+ mon.host8 (mon.0) 158593 : cluster
[ERR] host host3 (x.x.x.x) failed check: Unable to reach remote host
host3. Process exited with non-zero exit status 3

and here are some outputs of the commands:

[root@host8 ~]# ceph -s
  cluster:
id: xxx
health: HEALTH_ERR
9 hosts fail cephadm check
Upgrade: failed due to an unexpected exception

  services:
mon: 5 daemons, quorum host8,host1,host7,host2,host9 (age 2w)
mgr: host9.arhpnd(active, since 105m), standbys: host8.jowfih,
host1.warjsr, host2.qyavjj
mds: 1/1 daemons up, 3 standby
osd: 37 osds: 37 up (since 8h), 37 in (since 3w)

  data:


  io:
client:

  progress:
Upgrade to 17.2.5 (0s)
  []

[root@host8 ~]# ceph orch upgrade status
{
"target_image": "my-private-repo/quay-io/ceph/ceph@sha256
:34c763383e3323c6bb35f3f2229af9f466518d9db926111277f5e27ed543c427",
"in_progress": true,
"which": "Upgrading all daemon types on all hosts",
"services_complete": [],
"progress": "3/59 daemons upgraded",
"message": "Error: UPGRADE_EXCEPTION: Upgrade: failed due to an
unexpected exception",
"is_paused": true
}
[root@host8 ~]# ceph cephadm check-host host7
check-host failed:
Host 'host7' not found. Use 'ceph orch host ls' to see all managed hosts.
[root@host8 ~]# ceph versions
{
"mon": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 5
},
"mgr": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 1,
"ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757)
quincy (stable)": 3
},
"osd": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 37
},
"mds": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 4
},
"overall": {
"ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894)
pacific (stable)": 47,
"ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757)
quincy (stable)": 3
}
}

The strange thing is that I can roll back the cluster status by failing over
to a not-yet-upgraded mgr like this:

ceph mgr fail
ceph orch upgrade start <image>
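
(A more explicit variant of that sequence might look like this; a sketch only,
with <image> standing in for the target container image:)

ceph orch upgrade stop
ceph mgr fail
ceph orch upgrade status
ceph orch upgrade start --image <image>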

[ceph-users] Some hint for a DELL PowerEdge T440/PERC H750 Controller...

2023-04-06 Thread Marco Gaiarin


We are testing an experimental Ceph cluster with the server and controller named
in the subject.

The controller does not have an HBA mode, only a 'NonRAID' mode, some sort of
'auto RAID0' configuration.

We are using SATA SSD disks (MICRON MTFDDAK480TDT) that perform very well,
and SAS HDD disks (SEAGATE ST8000NM014A) that instead perform very badly
(particularly, very low IOPS).


Is there some hint for disk/controller configuration/optimization?


Thanks.

-- 
  I believe in chemistry as much as Julius Caesar believed in chance...
  and I'm fine with it as long as it doesn't concern me :)   (Emanuele Pucciarelli)



[ceph-users] Re: Help needed to configure erasure coding LRC plugin

2023-04-06 Thread Michel Jouvin

Hi,

Is somebody using the LRC plugin?

I came to the conclusion that LRC k=9, m=3, l=4 is not the same as 
jerasure k=9, m=6 in terms of protection against failures, and that I 
should use k=9, m=6, l=5 to get a level of resilience >= jerasure k=9, 
m=6. The example in the documentation (k=4, m=2, l=3) suggests that this 
LRC configuration gives something better than jerasure k=4, m=2, as it is 
resilient to 3 drive failures (but not 4, if I understood properly). So 
how many drives can fail in the k=9, m=6, l=5 configuration, first 
without losing RW access and second without losing data?


Another thing that I don't quite understand is that a pool created with 
this configuration (and failure domain=osd, locality=datacenter) has 
min_size=3 (max_size=18, as expected). That seems wrong to me; I'd have 
expected something around 10 (depending on the answer to the previous question)...
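
(For checking or adjusting this, something along these lines; pool and
profile names are placeholders, and whether raising min_size is appropriate
depends on the answer to the question above:)

ceph osd pool get <pool> min_size
ceph osd erasure-code-profile get <profile>
ceph osd pool set <pool> min_size 10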


Thanks in advance if somebody could provide some sort of authoritative 
answer on these 2 questions. Best regards,


Michel

On 04/04/2023 at 15:53, Michel Jouvin wrote:
Answering my own question, I found the reason for 2147483647: it's 
documented as a failure to find enough OSDs (missing OSDs). And that is 
expected, as I selected different hosts for the 15 OSDs but I have only 
12 hosts!
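
(This kind of mapping problem can also be checked offline with crushtool
before creating the pool; a sketch, using the rule id 6 shown further down:)

ceph osd getcrushmap -o /tmp/crushmap
crushtool -i /tmp/crushmap --test --rule 6 --num-rep 15 --show-bad-mappings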


I'm still interested in an "expert" confirming that the LRC k=9, m=3, l=4 
configuration is equivalent, in terms of redundancy, to a jerasure 
configuration with k=9, m=6.


Michel

On 04/04/2023 at 15:26, Michel Jouvin wrote:

Hi,

As discussed in another thread (Crushmap rule for multi-datacenter 
erasure coding), I'm trying to create an EC pool spanning 3 
datacenters (the datacenters are present in the crushmap), with the 
objective of being resilient to 1 DC down, at least keeping read-only 
access to the pool and, if possible, read-write access, and having a 
storage efficiency better than 3-replica (let's say a storage overhead 
<= 2).


In the discussion, somebody mentioned the LRC plugin as a possible 
alternative to jerasure for implementing this without tweaking the 
crushmap rule to do the 2-step OSD allocation. I looked at the 
documentation 
(https://docs.ceph.com/en/latest/rados/operations/erasure-code-lrc/) 
but I have some questions, if someone has experience/expertise with 
this LRC plugin.


I tried to create a rule using 5 OSDs per datacenter (15 in 
total), with 3 per datacenter (9 in total) being data chunks and the 
others being coding chunks. For this, based on my understanding of the 
examples, I used k=9, m=3, l=4. Is that right? Is this configuration 
equivalent, in terms of redundancy, to a jerasure configuration with k=9, m=6?
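
(For reference, a sketch of what such a profile definition looks like with
the LRC plugin, using the parameters above; the profile and pool names are
placeholders:)

ceph osd erasure-code-profile set lrc_9_3_4 \
    plugin=lrc k=9 m=3 l=4 \
    crush-failure-domain=osd crush-locality=datacenter
ceph osd pool create <pool> <pg_num> <pgp_num> erasure lrc_9_3_4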


The resulting rule, which looks correct to me, is:



{
    "rule_id": 6,
    "rule_name": "test_lrc_2",
    "ruleset": 6,
    "type": 3,
    "min_size": 3,
    "max_size": 15,
    "steps": [
    {
    "op": "set_chooseleaf_tries",
    "num": 5
    },
    {
    "op": "set_choose_tries",
    "num": 100
    },
    {
    "op": "take",
    "item": -4,
    "item_name": "default~hdd"
    },
    {
    "op": "choose_indep",
    "num": 3,
    "type": "datacenter"
    },
    {
    "op": "chooseleaf_indep",
    "num": 5,
    "type": "host"
    },
    {
    "op": "emit"
    }
    ]
}



Unfortunately, it doesn't work as expected: a pool created with this 
rule ends up with its PGs active+undersized, which is unexpected to 
me. Looking at the `ceph health detail` output, I see for each PG 
something like:


pg 52.14 is stuck undersized for 27m, current state 
active+undersized, last acting 
[90,113,2147483647,103,64,147,164,177,2147483647,133,58,28,8,32,2147483647]


For each PG there are 3 '2147483647' entries, and I guess that is the 
reason for the problem. What are these entries about? Clearly they are 
not OSD IDs... It looks like a negative number, -1, which in terms 
of crushmap IDs is the crushmap root (named "default" in our 
configuration). Is there any trivial mistake I may have made?


Thanks in advance for any help, or for sharing any successful 
configuration.


Best regards,

Michel


[ceph-users] Re: Misplaced objects greater than 100%

2023-04-06 Thread Joachim Kraftmayer
Perhaps this option triggered the crush map change:

osd crush update on start

Each time the OSD starts, it verifies it is in the correct location in
the CRUSH map and, if it is not, it moves itself.

 https://docs.ceph.com/en/quincy/rados/operations/crush-map/
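
A sketch of how to check and, if needed, disable that behaviour (assuming the
setting lives in the cluster config database):

ceph config get osd osd_crush_update_on_start
ceph config set osd osd_crush_update_on_start false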

Joachim


Johan Hattne wrote on Wed., 5 Apr 2023, 22:21:

> I think this is resolved—and you're right about the 0-weight of the root
> bucket being strange. I had created the rack buckets with
>
> # ceph osd crush add-bucket rack-0 rack
>
> whereas I should have used something like
>
> # ceph osd crush add-bucket rack-0 rack root=default
>
> There's a bit in the documentation
> (https://docs.ceph.com/en/quincy/rados/operations/crush-map) that says
> "Not all keys need to be specified" (in a different context, I admit).
>
> I might have saved a second or two by omitting "root=default" and maybe
> half a minute by not checking the CRUSH map carefully afterwards.  It
> was not worth it.
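>
> (For anyone in the same spot, a sketch of moving an already-created bucket
> under the root afterwards instead of recreating it:)
>
> # ceph osd crush move rack-0 root=default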
>
> // J
>
> On 2023-04-05 12:01, c...@elchaka.de wrote:
> > I guess this is related to your crush rules...
> > Unfortunately I don't know much about creating the rules...
> >
> > But someone could give more insights if you also provide
> >
> > crush rule dump
> >
> >  your "-1 0 root default" is a bit strange
> >
> >
> > On 1 April 2023 at 01:01:39 CEST, Johan Hattne wrote:
> >
> > Here goes:
> >
> > # ceph -s
> >   cluster:
> >     id:     e1327a10-8b8c-11ed-88b9-3cecef0e3946
> >     health: HEALTH_OK
> >
> >   services:
> >     mon: 5 daemons, quorum bcgonen-a,bcgonen-b,bcgonen-c,bcgonen-r0h0,bcgonen-r0h1 (age 16h)
> >     mgr: bcgonen-b.furndm(active, since 8d), standbys: bcgonen-a.qmmqxj
> >     mds: 1/1 daemons up, 2 standby
> >     osd: 36 osds: 36 up (since 16h), 36 in (since 3d); 1041 remapped pgs
> >
> >   data:
> >     volumes: 1/1 healthy
> >     pools:   3 pools, 1041 pgs
> >     objects: 5.42M objects, 6.5 TiB
> >     usage:   19 TiB used, 428 TiB / 447 TiB avail
> >     pgs:     27087125/16252275 objects misplaced (166.667%)
> >              1039 active+clean+remapped
> >              2    active+clean+remapped+scrubbing+deep
> >
> > # ceph osd tree
> > ID   CLASS  WEIGHT     TYPE NAME              STATUS  REWEIGHT  PRI-AFF
> > -14         149.02008  rack rack-1
> >  -7         149.02008      host bcgonen-r1h0
> >  20    hdd   14.55269          osd.20             up   1.0       1.0
> >  21    hdd   14.55269          osd.21             up   1.0       1.0
> >  22    hdd   14.55269          osd.22             up   1.0       1.0
> >  23    hdd   14.55269          osd.23             up   1.0       1.0
> >  24    hdd   14.55269          osd.24             up   1.0       1.0
> >  25    hdd   14.55269          osd.25             up   1.0       1.0
> >  26    hdd   14.55269          osd.26             up   1.0       1.0
> >  27    hdd   14.55269          osd.27             up   1.0       1.0
> >  28    hdd   14.55269          osd.28             up   1.0       1.0
> >  29    hdd   14.55269          osd.29             up   1.0       1.0
> >  34    ssd    1.74660          osd.34             up   1.0       1.0
> >  35    ssd    1.74660          osd.35             up   1.0       1.0
> > -13         298.04016  rack rack-0
> >  -3         149.02008      host bcgonen-r0h0
> >   0    hdd   14.55269          osd.0              up   1.0       1.0
> >   1    hdd   14.55269          osd.1              up   1.0       1.0
> >   2    hdd   14.55269          osd.2              up   1.0       1.0
> >   3    hdd   14.55269          osd.3              up   1.0       1.0
> >   4    hdd   14.55269          osd.4              up   1.0       1.0
> >   5    hdd   14.55269          osd.5              up   1.0       1.0
> >   6    hdd   14.55269          osd.6              up   1.0       1.0
> >   7    hdd   14.55269          osd.7              up   1.0       1.0
> >   8    hdd   14.55269          osd.8              up   1.0       1.0
> >   9    hdd   14.55269          osd.9              up   1.0       1.0
> >  30    ssd    1.74660          osd.30             up   1.0       1.0
> >  31    ssd    1.74660          osd.31             up   1.0       1.0
> >  -5         149.02008      host bcgonen-r0h1
> >  10    hdd   14.55269          osd.10             up   1.0       1.0
> >  11    hdd   14.55269          osd.11             up   1.0       1.0
> >  12    hdd   14.55269          osd.12             up   1.0       1.0
> >  13    hdd   14.55269          osd.13             up   1.0       1.0
> >  14    hdd   14.55269          osd.14             up   1.0       1.0
> >  15    hdd   14.55269          osd.15