Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Ronny Aasen
Just as long as you are aware that size=3, min_size=2 is the right config for everyone except those who really know what they are doing. And if you ever run min_size=1, you had better expect to corrupt your cluster sooner or later. Ronny On 05.12.2017 21:22, Denes Dolhay wrote: Hi, So for
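The size/min_size advice above can be illustrated with a small toy model. This is a sketch under stated assumptions, not Ceph code: the class `PoolPolicy` and its methods are hypothetical names made up for this example, and the only behaviour modelled is the rule that a placement group serves I/O while at least min_size replicas are available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolPolicy:
    """Hypothetical stand-in for a pool's replication settings."""
    size: int      # number of replicas the pool aims to keep
    min_size: int  # replicas that must be available before I/O is served

    def accepts_io(self, live_replicas: int) -> bool:
        # I/O is served only while enough replicas are available.
        return live_replicas >= self.min_size

    def spare_copies_at_threshold(self) -> int:
        # Extra copies of a new write that exist when the placement group
        # is running with exactly min_size replicas.
        return self.min_size - 1


recommended = PoolPolicy(size=3, min_size=2)
risky = PoolPolicy(size=3, min_size=1)

for policy in (recommended, risky):
    for live in (3, 2, 1):
        print(policy, "live replicas:", live, "serves I/O:", policy.accepts_io(live))
```

In this model, min_size=1 keeps serving writes with a single live replica, so a new write may exist on only one OSD. Losing that OSD before backfilling finishes leaves no other copy of the write, which is the scenario the thread warns about.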

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Denes Dolhay
Hi, So for this to happen you have to lose another OSD before backfilling is done. Thank you! This clarifies it! Denes On 12/05/2017 03:32 PM, Ronny Aasen wrote: On 5 Dec. 2017 10:26, Denes Dolhay wrote: Hi, This question has come up a few times already under filestore and bluestore t

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Ronny Aasen
On 5 Dec. 2017 10:26, Denes Dolhay wrote: Hi, This question has come up a few times already under both filestore and bluestore, but please help me understand why this is: "when you have 2 different objects, both with correct digests, in your cluster, the cluster cannot know which of the 2 o

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Denes Dolhay
Hi, This question has come up a few times already under both filestore and bluestore, but please help me understand why this is: "when you have 2 different objects, both with correct digests, in your cluster, the cluster cannot know which of the 2 objects is the correct one." Doesn't it us
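The quoted statement about two objects with correct digests can be made concrete with a toy model. This is only a sketch and is not Ceph's actual scrub or repair logic: the function names are hypothetical, SHA-256 from hashlib is used purely for illustration, and a simple majority vote is an assumed tie-breaking rule. It shows that a digest can expose a copy that was damaged after it was written, but cannot choose between two different copies that each match their own digest.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Illustrative checksum; the real digest algorithm may differ."""
    return hashlib.sha256(data).hexdigest()


def pick_authoritative(replicas):
    """Return the agreed-upon data, or None if the copies are ambiguous.

    replicas is a list of (data, stored_digest) pairs, one per copy.
    """
    # Discard copies whose content no longer matches their stored digest.
    valid = [data for data, stored in replicas if digest(data) == stored]
    if not valid:
        return None
    counts = Counter(valid).most_common()
    # A tie between different self-consistent copies cannot be resolved.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


old, new = b"version 1", b"version 2"

# Two copies, both internally consistent, but different: ambiguous.
print(pick_authoritative([(old, digest(old)), (new, digest(new))]))

# Three copies, two of which agree: the majority can be identified.
print(pick_authoritative([(old, digest(old)), (new, digest(new)), (new, digest(new))]))

# Two copies, one damaged after writing: the digest exposes the bad copy.
print(pick_authoritative([(b"versi0n 2", digest(new)), (new, digest(new))]))
```

With two replicas that diverged, for example after writes were accepted on a single copy, both candidates pass their own checksum, so there is nothing left to vote with. A third copy gives this toy model a way to break the tie.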

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Ronny Aasen
On 5 Dec. 2017 09:18, Gonzalo Aguilar Delgado wrote: Hi, I created this: http://paste.debian.net/999172/ But the expiration date is too short, so I also posted it at https://pastebin.com/QfrE71Dg. What I want to mention is that there's no known cause for what's happening. It's true that time de

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Gonzalo Aguilar Delgado
Hi, I created this: http://paste.debian.net/999172/ But the expiration date is too short, so I also posted it at https://pastebin.com/QfrE71Dg. What I want to mention is that there's no known cause for what's happening. It's true that time desync happens on reboot because of a few milliseconds of skew. But ntp cor

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-04 Thread Ronny Aasen
On 4 Dec. 2017 10:22, Gonzalo Aguilar Delgado wrote: Hello, Things are getting worse every day.
ceph -w
    cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
     health HEALTH_ERR
            1 pgs are stuck inactive for more than 300 seconds
            8 pgs inconsistent
            1 pgs r

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-04 Thread Gonzalo Aguilar Delgado
h the 3 replicas is that you have enough copies to recover from a failed OSD. In my tests this seems to go fine automatically. Are you doing something that is not advised? -----Original Message----- From: Gonzalo Aguilar

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-03 Thread Gonzalo Aguilar Delgado
ests this seems to go fine automatically. Are you doing something that is not advised? -----Original Message----- From: Gonzalo Aguilar Delgado [mailto:gagui...@aguilardelgado.com] Sent: Saturday, 25 November 2017 20:44 To: 'ceph-users'

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-11-26 Thread Marc Roos
gagui...@aguilardelgado.com] Sent: Saturday, 25 November 2017 20:44 To: 'ceph-users' Subject: [ceph-users] Another OSD broken today. How can I recover it? Hello, I had another blackout with Ceph today. It seems that Ceph OSDs fail from time to time and are unable to recover. I ha

[ceph-users] Another OSD broken today. How can I recover it?

2017-11-25 Thread Gonzalo Aguilar Delgado
Hello, I had another blackout with Ceph today. It seems that Ceph OSDs fail from time to time and are unable to recover. I have 3 OSDs down now: 1 removed from the cluster and 2 down because I'm unable to recover them. We really need a recovery tool. It's not normal that an OSD breaks and