Re: [ceph-users] Usage of devices in SSD pool vary very much

2019-01-26 Thread Konstantin Shalygin
On 1/26/19 10:24 PM, Kevin Olbrich wrote: I just had the time to check again: even after removing the broken OSD, mgr still crashes. All OSDs are up and in. If I run "ceph balancer on" on a HEALTH_OK cluster, an optimization plan is generated and started. After some minutes all MGRs die. This is
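
A minimal sketch of the balancer workflow being described, assuming the mgr balancer module shipped with Luminous and later ("myplan" is a placeholder name):

  ceph balancer status              # current mode and whether automatic balancing is active
  ceph balancer mode upmap          # or crush-compat, depending on client compatibility
  ceph balancer eval                # score the current PG distribution
  ceph balancer optimize myplan     # generate an optimization plan
  ceph balancer show myplan         # inspect the proposed changes
  ceph balancer execute myplan      # apply once; "ceph balancer on" enables automatic mode

If the active mgr dies as soon as a plan runs, "ceph balancer off" stops further plans while the crash is investigated.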

Re: [ceph-users] How To Properly Failover a HA Setup

2019-01-26 Thread Charles Tassell
I tried setting noout, and that did give a somewhat better result. Basically I could stop the OSD on the inactive server and everything still worked (after a 2-3 second pause), but when I rebooted the inactive server everything hung again until it came back online and resynced with the cluste
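
For reference, a hedged sketch of the noout-based maintenance flow under discussion (the OSD ID and exact ordering are placeholders, not the poster's steps):

  ceph osd set noout                # keep down OSDs from being marked out during the reboot
  systemctl stop ceph-osd@3         # stop the OSD(s) on the node to be maintained
  # ... reboot the node and wait for its OSDs to come back up ...
  ceph -s                           # confirm peering has settled and health is OK
  ceph osd unset noout              # restore normal marking-out behaviour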

[ceph-users] Questions about using existing HW for PoC cluster

2019-01-26 Thread Will Dennis
Hi all, I'm kind of new to Ceph (I have been using 10.2.11 on a 3-node Proxmox 4.x cluster [hyperconverged], and it works great!) and now I'm thinking of perhaps using it for a bigger data storage project at work: a PoC at first, but built as correctly as possible for performance and availability. I have th

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Christian Balzer
Hello, this is where (depending on your topology) something like: --- mon_osd_down_out_subtree_limit = host --- can come in very handy. Provided you have correct monitoring, alerting and operations, a down node can often be restored long before any recovery would have finished, and you a
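
A sketch of where that setting would usually live, assuming it is applied via ceph.conf on the monitors; with "host", the OSDs of a whole down host are not automatically marked out, so no mass rebalance starts while the node is being repaired:

  [mon]
  mon_osd_down_out_subtree_limit = host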

[ceph-users] Bucket logging howto

2019-01-26 Thread Marc Roos
From the owner account of the bucket I am trying to enable logging, but I don't get how this should work. I see that s3:PutBucketLogging is supported, so I guess it should be possible. How do you enable it? And how do you access the log? [@ ~]$ s3cmd -c .s3cfg accesslog s3://archive Access loggi
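
For what it's worth, a hedged sketch of how s3cmd expresses bucket access logging, with placeholder bucket names; whether a given RGW version actually accepts the call and delivers log objects is a separate question:

  s3cmd -c .s3cfg accesslog s3://archive --access-logging-target-prefix=s3://logbucket/archive-
  s3cmd -c .s3cfg accesslog s3://archive    # query the current logging status
  s3cmd -c .s3cfg ls s3://logbucket         # any delivered log objects would show up here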

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Brian Topping
I went through this as I reformatted all the OSDs on a much smaller cluster last weekend. When turning nodes back on, PGs would sometimes move, only to move back, prolonging the operation and the system stress. What I took away is that it causes the least overall system stress to have the OSD tree back to tar

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Götz Reinicke
Dear Chris, thanks for your feedback. The node/OSDs in question are part of an erasure-coded pool, and during the weekend the workload should be close to none. But anyway, I could get a look at the console and at the server; the power is up, but I can't use the console, the login prompt is shown,

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Chris
It sort of depends on your workload/use case. Recovery operations can be computationally expensive. If your load is light because it's the weekend, you should be able to turn that host back on, with minimal impact, as soon as you resolve whatever the issue is. You can also increase the priority
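
A hedged sketch of the knobs usually meant by raising recovery priority, injected at runtime (the values are illustrative, not recommendations; the Luminous/Mimic defaults are osd_max_backfills=1 and osd_recovery_max_active=3):

  ceph tell 'osd.*' injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'
  # and back to the defaults once recovery has finished:
  ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'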

[ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Götz Reinicke
Hi, one host out of 10 is down for as yet unknown reasons. I guess a power failure; I could not yet see the server. The cluster is recovering and remapping fine, but still has some objects to process. My question: may I just switch the server back on so that, in the best case, the 24 OSDs get back online

Re: [ceph-users] repair do not work for inconsistent pg which three replica are the same

2019-01-26 Thread ceph
On 10 January 2019 at 08:43:30 CET, Wido den Hollander wrote: > > >On 1/10/19 8:36 AM, hnuzhoulin2 wrote: >> >> Hi, cephers >> >> I have two inconsistent PGs. I tried to list the inconsistent objects and got nothing. >> >> rados list-inconsistent-obj 388.c29 >> No scrub information available for pg 388.c29 >> erro
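
For context, a sketch of the usual sequence when list-inconsistent-obj has no data, reusing the PG ID from the quoted mail; a fresh deep scrub normally has to complete before the inconsistency details become available:

  ceph pg deep-scrub 388.c29        # repopulate the scrub/inconsistency information
  rados list-inconsistent-obj 388.c29 --format=json-pretty
  ceph pg repair 388.c29            # only after it is clear which copy is bad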

Re: [ceph-users] Usage of devices in SSD pool vary very much

2019-01-26 Thread Kevin Olbrich
Hi! I just had the time to check again: even after removing the broken OSD, mgr still crashes. All OSDs are up and in. If I run "ceph balancer on" on a HEALTH_OK cluster, an optimization plan is generated and started. After some minutes all MGRs die. This is a major problem for me, as I still got

Re: [ceph-users] Resizing an online mounted ext4 on a rbd - failed

2019-01-26 Thread Götz Reinicke
> On 26.01.2019 at 14:16, Kevin Olbrich wrote: > > On Sat, 26 Jan 2019 at 13:43, Götz Reinicke > wrote: >> >> Hi, >> >> I have a fileserver that has a 4 TB RBD mounted, formatted with ext4. >> >> I grew that RBD and the ext4 on it, starting from a 2 TB RBD, this way: >> >> rbd resize testpool/

Re: [ceph-users] Resizing an online mounted ext4 on a rbd - failed

2019-01-26 Thread Kevin Olbrich
On Sat, 26 Jan 2019 at 13:43, Götz Reinicke wrote: > > Hi, > > I have a fileserver that has a 4 TB RBD mounted, formatted with ext4. > > I grew that RBD and the ext4 on it, starting from a 2 TB RBD, this way: > > rbd resize testpool/disk01 --size 4194304 > > resize2fs /dev/rbd0 > > Today I wanted to ext

[ceph-users] Resizing an online mounted ext4 on a rbd - failed

2019-01-26 Thread Götz Reinicke
Hi, I have a fileserver that has a 4 TB RBD mounted, formatted with ext4. I grew that RBD and the ext4 on it, starting from a 2 TB RBD, this way: rbd resize testpool/disk01 --size 4194304 resize2fs /dev/rbd0 Today I wanted to extend that ext4 to 8 TB and did: rbd resize testpool/disk01 --size 8388608 resi
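
A hedged sketch of sanity checks around the quoted resize, assuming the krbd mapping at /dev/rbd0 from the mail; if the mapped block device has not picked up the new size, resize2fs has nothing to grow into:

  rbd resize testpool/disk01 --size 8388608   # --size is in megabytes; 8388608 corresponds to the intended 8 TB
  rbd info testpool/disk01                    # confirm the image itself reports the new size
  blockdev --getsize64 /dev/rbd0              # confirm the mapped device sees the new size too
  resize2fs /dev/rbd0                         # grow the mounted ext4 online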

Re: [ceph-users] Migrating to a dedicated cluster network

2019-01-26 Thread Simon Leinen
Paul Emmerich writes: > A split network is rarely worth it; one fast network is usually better. > And since you mentioned having only two interfaces: one bond is way > better than two independent interfaces. > IPv4/6 dual-stack setups will be supported in Nautilus; you currently > have to use eithe
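
For reference, a sketch of the ceph.conf options this kind of migration revolves around, with placeholder subnets; with a single bonded interface both options can simply point at the same network, or cluster_network can be omitted entirely:

  [global]
  public_network  = 192.0.2.0/24      # mon/client-facing traffic
  cluster_network = 198.51.100.0/24   # OSD replication and backfill traffic (optional)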

Re: [ceph-users] Using Ceph central backup storage - Best practice creating pools

2019-01-26 Thread Simon Leinen
cmonty14 writes: > Due to performance issues, RGW is not an option. This statement may be > wrong, but there's the following aspect to consider. > If I write a backup, which is typically a large file, this is normally a > single IO stream. > This causes massive performance issues on Ceph because th
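
On the pool-creation side of the question, a minimal sketch with placeholder pool names and PG counts (PG counts have to be sized for the actual OSD count), assuming one RBD pool per backup client:

  ceph osd pool create backup-client1 64 64 replicated
  ceph osd pool application enable backup-client1 rbd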