Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-22 Thread Jason Dillaman
FYI -- I opened a tracker ticket [1] against ceph-osd for this issue
so that it doesn't get dropped.

[1] http://tracker.ceph.com/issues/20041

On Mon, May 22, 2017 at 9:57 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> attached is the thread apply all bt output while a pg is deadlocked in
> scrubbing.
>
> The core dump is 5.5GB / 500MB compressed.
>
> pg_stat objects mip degr misp unf bytes   log disklog
> state   state_stamp v   reportedup  up_primary
> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
> deep_scrub_stamp
>
> 3.137   2028    0   0   0   0   7236313088  3007
> 3007    active+clean+scrubbing  2017-05-21 04:45:30.321710
> 453856'20824101  453856:25450494 [116,37,85] 116 [116,37,85]
> 116 452296'20769703 2017-05-19 23:40:14.150887  451238'20721095
>2017-05-17 21:25:09.174598
>
> Greets,
> Stefan
>
> Am 22.05.2017 um 15:32 schrieb Jason Dillaman:
>> If you have the debug symbols installed, I'd say "thread apply all bt"
>> in addition to a "generate-core-file". The backtrace would only be
>> useful if it showed a thread deadlocked on something.
>>
>> On Mon, May 22, 2017 at 9:29 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> should i do a coredump or a thread apply all bt?
>>>
>>> Don't know what is better.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 22.05.2017 um 15:19 schrieb Jason Dillaman:
 If you cannot recreate with debug logging enabled, that might be the
 next best option.

 On Mon, May 22, 2017 at 2:30 AM, Stefan Priebe - Profihost AG
  wrote:
> Hello Jason,
>
> i had another 8 cases where scrub was running for hours. Sadly i
> couldn't get it to hang again after an osd restart. Any further ideas?
>
> Coredump of the OSD with hanging scrub?
>
> Greets,
> Stefan
>
> Am 18.05.2017 um 17:26 schrieb Jason Dillaman:
>> I'm unfortunately out of ideas at the moment. I think the best chance
>> of figuring out what is wrong is to repeat it while logs are enabled.
>>
>> On Wed, May 17, 2017 at 4:51 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> No i can't reproduce it with active logs. Any furthr ideas?
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
 Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
> Any chance you still have debug logs enabled on OSD 23 after you
> restarted it and the scrub froze again?

 No but i can do that ;-) Hopefully it freezes again.

 Stefan

>
> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello,
>>
>> now it shows again:
 4095 active+clean
1 active+clean+scrubbing
>>
>> and:
>> # ceph pg dump | grep -i scrub
>> dumped all in format plain
>> pg_stat objects mip degrmispunf bytes   log 
>> disklog
>> state   state_stamp v   reportedup  up_primary
>> acting  acting_primary  last_scrub  scrub_stamp 
>> last_deep_scrub
>> deep_scrub_stamp
>> 2.aa40400   0   0   0   10128667136 3010
>> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
>> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
>> 23  176730'10793226 2017-05-10 03:43:20.849784  
>> 171715'10548192
>>2017-05-04 14:27:39.210713
>>
>> So it seems the same scrub is stuck again... even after restarting 
>> the
>> osd. It just took some time until the scrub of this pg happened 
>> again.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
>>> Can you share your current OSD configuration? It's very curious that
>>> your scrub is getting randomly stuck on a few objects for hours at a
>>> time until an OSD is reset.
>>>
>>> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 minutes ago i had another case where i restarted the osd which was 
 shown
 in objecter_requests output.

 It seems also other scrubs and deep scrubs were hanging.

 Output before:
 4095 active+clean
1 active+clean+scrubbing

 Output after restart:
 4084 active+clean
7 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-22 Thread Stefan Priebe - Profihost AG
Hello Jason,

attached is the thread apply all bt output while a pg is deadlocked in
scrubbing.

The core dump is 5.5GB / 500MB compressed.

pg_stat objects mip degr misp unf bytes   log disklog
state   state_stamp v   reportedup  up_primary
acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
deep_scrub_stamp

3.137   2028    0   0   0   0   7236313088  3007
3007    active+clean+scrubbing  2017-05-21 04:45:30.321710
453856'20824101  453856:25450494 [116,37,85] 116 [116,37,85]
116 452296'20769703 2017-05-19 23:40:14.150887  451238'20721095
   2017-05-17 21:25:09.174598

Greets,
Stefan

Am 22.05.2017 um 15:32 schrieb Jason Dillaman:
> If you have the debug symbols installed, I'd say "thread apply all bt"
> in addition to a "generate-core-file". The backtrace would only be
> useful if it showed a thread deadlocked on something.
> 
> On Mon, May 22, 2017 at 9:29 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> should i do a coredump or a thread apply all bt?
>>
>> Don't know what is better.
>>
>> Greets,
>> Stefan
>>
>> Am 22.05.2017 um 15:19 schrieb Jason Dillaman:
>>> If you cannot recreate with debug logging enabled, that might be the
>>> next best option.
>>>
>>> On Mon, May 22, 2017 at 2:30 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 i had another 8 cases where scrub was running for hours. Sadly i
 couldn't get it to hang again after an osd restart. Any further ideas?

 Coredump of the OSD with hanging scrub?

 Greets,
 Stefan

 Am 18.05.2017 um 17:26 schrieb Jason Dillaman:
> I'm unfortunately out of ideas at the moment. I think the best chance
> of figuring out what is wrong is to repeat it while logs are enabled.
>
> On Wed, May 17, 2017 at 4:51 PM, Stefan Priebe - Profihost AG
>  wrote:
>> No i can't reproduce it with active logs. Any furthr ideas?
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
>>> Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
 Any chance you still have debug logs enabled on OSD 23 after you
 restarted it and the scrub froze again?
>>>
>>> No but i can do that ;-) Hopefully it freezes again.
>>>
>>> Stefan
>>>

 On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
  wrote:
> Hello,
>
> now it shows again:
>>> 4095 active+clean
>>>1 active+clean+scrubbing
>
> and:
> # ceph pg dump | grep -i scrub
> dumped all in format plain
> pg_stat objects mip degrmispunf bytes   log 
> disklog
> state   state_stamp v   reportedup  up_primary
> acting  acting_primary  last_scrub  scrub_stamp 
> last_deep_scrub
> deep_scrub_stamp
> 2.aa40400   0   0   0   10128667136 3010
> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
> 23  176730'10793226 2017-05-10 03:43:20.849784  
> 171715'10548192
>2017-05-04 14:27:39.210713
>
> So it seems the same scrub is stuck again... even after restarting the
> osd. It just took some time until the scrub of this pg happened again.
>
> Greets,
> Stefan
> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
>> Can you share your current OSD configuration? It's very curious that
>> your scrub is getting randomly stuck on a few objects for hours at a
>> time until an OSD is reset.
>>
>> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> minutes ago i had another case where i restarted the osd which was 
>>> shown
>>> in objecter_requests output.
>>>
>>> It seems also other scrubs and deep scrubs were hanging.
>>>
>>> Output before:
>>> 4095 active+clean
>>>1 active+clean+scrubbing
>>>
>>> Output after restart:
>>> 4084 active+clean
>>>7 active+clean+scrubbing+deep
>>>5 active+clean+scrubbing
>>>
>>> both values are changing every few seconds again doing a lot of 
>>> scrubs
>>> and deep scubs.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
 Hi,

 that command does not 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-22 Thread Jason Dillaman
If you have the debug symbols installed, I'd say "thread apply all bt"
in addition to a "generate-core-file". The backtrace would only be
useful if it showed a thread deadlocked on something.
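
For reference, a minimal sketch of that gdb procedure against a running ceph-osd, assuming debug symbols are installed, the daemon ID is 23, and accepting that gdb briefly pauses the OSD (PID lookup and output paths are examples, not taken from this thread):

OSD_PID=$(pidof ceph-osd)                      # assumes a single OSD on this host; otherwise pick the PID from ps
gdb -p "$OSD_PID" \
    -ex 'set pagination off' \
    -ex 'set logging file /tmp/osd.23-backtrace.txt' \
    -ex 'set logging on' \
    -ex 'thread apply all bt' \
    -ex 'generate-core-file /tmp/osd.23.core' \
    -ex 'detach' -ex 'quit'
xz -T0 /tmp/osd.23.core                        # compress the core before sharing it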

On Mon, May 22, 2017 at 9:29 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> should i do a coredump or a thread apply all bt?
>
> Don't know what is better.
>
> Greets,
> Stefan
>
> Am 22.05.2017 um 15:19 schrieb Jason Dillaman:
>> If you cannot recreate with debug logging enabled, that might be the
>> next best option.
>>
>> On Mon, May 22, 2017 at 2:30 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> i had another 8 cases where scrub was running for hours. Sadly i
>>> couldn't get it to hang again after an osd restart. Any further ideas?
>>>
>>> Coredump of the OSD with hanging scrub?
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 18.05.2017 um 17:26 schrieb Jason Dillaman:
 I'm unfortunately out of ideas at the moment. I think the best chance
 of figuring out what is wrong is to repeat it while logs are enabled.

 On Wed, May 17, 2017 at 4:51 PM, Stefan Priebe - Profihost AG
  wrote:
> No i can't reproduce it with active logs. Any furthr ideas?
>
> Greets,
> Stefan
>
> Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
>> Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
>>> Any chance you still have debug logs enabled on OSD 23 after you
>>> restarted it and the scrub froze again?
>>
>> No but i can do that ;-) Hopefully it freezes again.
>>
>> Stefan
>>
>>>
>>> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello,

 now it shows again:
>> 4095 active+clean
>>1 active+clean+scrubbing

 and:
 # ceph pg dump | grep -i scrub
 dumped all in format plain
 pg_stat objects mip degrmispunf bytes   log disklog
 state   state_stamp v   reportedup  up_primary
 acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
 deep_scrub_stamp
 2.aa40400   0   0   0   10128667136 3010
 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
2017-05-04 14:27:39.210713

 So it seems the same scrub is stuck again... even after restarting the
 osd. It just took some time until the scrub of this pg happened again.

 Greets,
 Stefan
 Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
> Can you share your current OSD configuration? It's very curious that
> your scrub is getting randomly stuck on a few objects for hours at a
> time until an OSD is reset.
>
> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> minutes ago i had another case where i restarted the osd which was 
>> shown
>> in objecter_requests output.
>>
>> It seems also other scrubs and deep scrubs were hanging.
>>
>> Output before:
>> 4095 active+clean
>>1 active+clean+scrubbing
>>
>> Output after restart:
>> 4084 active+clean
>>7 active+clean+scrubbing+deep
>>5 active+clean+scrubbing
>>
>> both values are changing every few seconds again doing a lot of 
>> scrubs
>> and deep scubs.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
>>> Hi,
>>>
>>> that command does not exist.
>>>
>>> But at least ceph -s permanently reports 1 pg in scrubbing with no 
>>> change.
>>>
>>> Log attached as well.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
 Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure 
 -- I
 can quickly scan the new log if you directly send it to me.

 On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
  wrote:
> can send the osd log - if you want?
>
> Stefan
>
> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> the command
>> # rados -p cephstor6 rm 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-22 Thread Stefan Priebe - Profihost AG
Hello Jason,

should I do a coredump or a thread apply all bt?

I don't know which is better.

Greets,
Stefan

Am 22.05.2017 um 15:19 schrieb Jason Dillaman:
> If you cannot recreate with debug logging enabled, that might be the
> next best option.
> 
> On Mon, May 22, 2017 at 2:30 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> i had another 8 cases where scrub was running for hours. Sadly i
>> couldn't get it to hang again after an osd restart. Any further ideas?
>>
>> Coredump of the OSD with hanging scrub?
>>
>> Greets,
>> Stefan
>>
>> Am 18.05.2017 um 17:26 schrieb Jason Dillaman:
>>> I'm unfortunately out of ideas at the moment. I think the best chance
>>> of figuring out what is wrong is to repeat it while logs are enabled.
>>>
>>> On Wed, May 17, 2017 at 4:51 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 No i can't reproduce it with active logs. Any furthr ideas?

 Greets,
 Stefan

 Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
> Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
>> Any chance you still have debug logs enabled on OSD 23 after you
>> restarted it and the scrub froze again?
>
> No but i can do that ;-) Hopefully it freezes again.
>
> Stefan
>
>>
>> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello,
>>>
>>> now it shows again:
> 4095 active+clean
>1 active+clean+scrubbing
>>>
>>> and:
>>> # ceph pg dump | grep -i scrub
>>> dumped all in format plain
>>> pg_stat objects mip degrmispunf bytes   log disklog
>>> state   state_stamp v   reportedup  up_primary
>>> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
>>> deep_scrub_stamp
>>> 2.aa40400   0   0   0   10128667136 3010
>>> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
>>> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
>>> 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
>>>2017-05-04 14:27:39.210713
>>>
>>> So it seems the same scrub is stuck again... even after restarting the
>>> osd. It just took some time until the scrub of this pg happened again.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
 Can you share your current OSD configuration? It's very curious that
 your scrub is getting randomly stuck on a few objects for hours at a
 time until an OSD is reset.

 On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
  wrote:
> Hello Jason,
>
> minutes ago i had another case where i restarted the osd which was 
> shown
> in objecter_requests output.
>
> It seems also other scrubs and deep scrubs were hanging.
>
> Output before:
> 4095 active+clean
>1 active+clean+scrubbing
>
> Output after restart:
> 4084 active+clean
>7 active+clean+scrubbing+deep
>5 active+clean+scrubbing
>
> both values are changing every few seconds again doing a lot of scrubs
> and deep scubs.
>
> Greets,
> Stefan
> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
>> Hi,
>>
>> that command does not exist.
>>
>> But at least ceph -s permanently reports 1 pg in scrubbing with no 
>> change.
>>
>> Log attached as well.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
>>> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- 
>>> I
>>> can quickly scan the new log if you directly send it to me.
>>>
>>> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 can send the osd log - if you want?

 Stefan

 Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
>
> the command
> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>
> hangs as well. Doing absolutely nothing... waiting forever.
>
> Greets,
> Stefan
>
> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
>> OSD 23 notes that object 
>> rbd_data.21aafa6b8b4567.0aaa is
>> waiting for a scrub. What happens if you run "rados -p <pool> rm

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-22 Thread Jason Dillaman
If you cannot recreate with debug logging enabled, that might be the
next best option.

On Mon, May 22, 2017 at 2:30 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> i had another 8 cases where scrub was running for hours. Sadly i
> couldn't get it to hang again after an osd restart. Any further ideas?
>
> Coredump of the OSD with hanging scrub?
>
> Greets,
> Stefan
>
> Am 18.05.2017 um 17:26 schrieb Jason Dillaman:
>> I'm unfortunately out of ideas at the moment. I think the best chance
>> of figuring out what is wrong is to repeat it while logs are enabled.
>>
>> On Wed, May 17, 2017 at 4:51 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> No i can't reproduce it with active logs. Any furthr ideas?
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
 Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
> Any chance you still have debug logs enabled on OSD 23 after you
> restarted it and the scrub froze again?

 No but i can do that ;-) Hopefully it freezes again.

 Stefan

>
> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello,
>>
>> now it shows again:
 4095 active+clean
1 active+clean+scrubbing
>>
>> and:
>> # ceph pg dump | grep -i scrub
>> dumped all in format plain
>> pg_stat objects mip degrmispunf bytes   log disklog
>> state   state_stamp v   reportedup  up_primary
>> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
>> deep_scrub_stamp
>> 2.aa40400   0   0   0   10128667136 3010
>> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
>> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
>> 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
>>2017-05-04 14:27:39.210713
>>
>> So it seems the same scrub is stuck again... even after restarting the
>> osd. It just took some time until the scrub of this pg happened again.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
>>> Can you share your current OSD configuration? It's very curious that
>>> your scrub is getting randomly stuck on a few objects for hours at a
>>> time until an OSD is reset.
>>>
>>> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 minutes ago i had another case where i restarted the osd which was 
 shown
 in objecter_requests output.

 It seems also other scrubs and deep scrubs were hanging.

 Output before:
 4095 active+clean
1 active+clean+scrubbing

 Output after restart:
 4084 active+clean
7 active+clean+scrubbing+deep
5 active+clean+scrubbing

 both values are changing every few seconds again doing a lot of scrubs
 and deep scubs.

 Greets,
 Stefan
 Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
> Hi,
>
> that command does not exist.
>
> But at least ceph -s permanently reports 1 pg in scrubbing with no 
> change.
>
> Log attached as well.
>
> Greets,
> Stefan
> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
>> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
>> can quickly scan the new log if you directly send it to me.
>>
>> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> can send the osd log - if you want?
>>>
>>> Stefan
>>>
>>> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
 Hello Jason,

 the command
 # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa

 hangs as well. Doing absolutely nothing... waiting forever.

 Greets,
 Stefan

 Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa 
> is
> waiting for a scrub. What happens if you run "rados -p  
> rm
> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 
> logs
> during this)? If that succeeds while your VM remains blocked on 
> that
> remove op, it looks like there is some problem in the OSD where 
> ops
> 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-22 Thread Stefan Priebe - Profihost AG
Hello Jason,

I had another 8 cases where a scrub was running for hours. Sadly I
couldn't get it to hang again after an OSD restart. Any further ideas?

Should I take a coredump of the OSD with the hanging scrub?

Greets,
Stefan
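
One way to see what the OSD itself thinks is blocked during such a hung scrub is its admin socket; a rough sketch, assuming osd.23, the default socket location and that jq is available (the exact JSON fields vary a bit between releases):

# run on the host that carries osd.23
ceph daemon osd.23 dump_ops_in_flight | jq -r '.ops[] | [.age, .description] | @tsv'
# newer releases also have a dedicated view of ops blocked e.g. on a scrub
ceph daemon osd.23 dump_blocked_ops      # if your release supports it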

Am 18.05.2017 um 17:26 schrieb Jason Dillaman:
> I'm unfortunately out of ideas at the moment. I think the best chance
> of figuring out what is wrong is to repeat it while logs are enabled.
> 
> On Wed, May 17, 2017 at 4:51 PM, Stefan Priebe - Profihost AG
>  wrote:
>> No i can't reproduce it with active logs. Any furthr ideas?
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
>>> Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
 Any chance you still have debug logs enabled on OSD 23 after you
 restarted it and the scrub froze again?
>>>
>>> No but i can do that ;-) Hopefully it freezes again.
>>>
>>> Stefan
>>>

 On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
  wrote:
> Hello,
>
> now it shows again:
>>> 4095 active+clean
>>>1 active+clean+scrubbing
>
> and:
> # ceph pg dump | grep -i scrub
> dumped all in format plain
> pg_stat objects mip degrmispunf bytes   log disklog
> state   state_stamp v   reportedup  up_primary
> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
> deep_scrub_stamp
> 2.aa40400   0   0   0   10128667136 3010
> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
> 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
>2017-05-04 14:27:39.210713
>
> So it seems the same scrub is stuck again... even after restarting the
> osd. It just took some time until the scrub of this pg happened again.
>
> Greets,
> Stefan
> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
>> Can you share your current OSD configuration? It's very curious that
>> your scrub is getting randomly stuck on a few objects for hours at a
>> time until an OSD is reset.
>>
>> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> minutes ago i had another case where i restarted the osd which was shown
>>> in objecter_requests output.
>>>
>>> It seems also other scrubs and deep scrubs were hanging.
>>>
>>> Output before:
>>> 4095 active+clean
>>>1 active+clean+scrubbing
>>>
>>> Output after restart:
>>> 4084 active+clean
>>>7 active+clean+scrubbing+deep
>>>5 active+clean+scrubbing
>>>
>>> both values are changing every few seconds again doing a lot of scrubs
>>> and deep scubs.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
 Hi,

 that command does not exist.

 But at least ceph -s permanently reports 1 pg in scrubbing with no 
 change.

 Log attached as well.

 Greets,
 Stefan
 Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
> can quickly scan the new log if you directly send it to me.
>
> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>  wrote:
>> can send the osd log - if you want?
>>
>> Stefan
>>
>> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
>>> Hello Jason,
>>>
>>> the command
>>> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>>>
>>> hangs as well. Doing absolutely nothing... waiting forever.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
 OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa 
 is
 waiting for a scrub. What happens if you run "rados -p  
 rm
 rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 
 logs
 during this)? If that succeeds while your VM remains blocked on 
 that
 remove op, it looks like there is some problem in the OSD where ops
 queued on a scrub are not properly awoken when the scrub completes.

 On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
  wrote:
> Hello Jason,
>
> after enabling the log and generating a gcore dump, the request 
> was

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-18 Thread Jason Dillaman
I'm unfortunately out of ideas at the moment. I think the best chance
of figuring out what is wrong is to repeat it while logs are enabled.
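
A sketch of one way to run that repeat-with-logs cycle in a single shot, assuming osd.23 is the primary of the stuck PG and reusing the pool/object names quoted further down in this thread (the timeout value is arbitrary):

ceph tell osd.23 injectargs '--debug-osd 20 --debug-ms 1'
timeout 120 rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
echo "rados rm exit code: $?"
ceph tell osd.23 injectargs '--debug-osd 0/5 --debug-ms 0/5'
# then collect /var/log/ceph/ceph-osd.23.log for that time window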

On Wed, May 17, 2017 at 4:51 PM, Stefan Priebe - Profihost AG
 wrote:
> No i can't reproduce it with active logs. Any furthr ideas?
>
> Greets,
> Stefan
>
> Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
>> Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
>>> Any chance you still have debug logs enabled on OSD 23 after you
>>> restarted it and the scrub froze again?
>>
>> No but i can do that ;-) Hopefully it freezes again.
>>
>> Stefan
>>
>>>
>>> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello,

 now it shows again:
>> 4095 active+clean
>>1 active+clean+scrubbing

 and:
 # ceph pg dump | grep -i scrub
 dumped all in format plain
 pg_stat objects mip degrmispunf bytes   log disklog
 state   state_stamp v   reportedup  up_primary
 acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
 deep_scrub_stamp
 2.aa40400   0   0   0   10128667136 3010
 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
2017-05-04 14:27:39.210713

 So it seems the same scrub is stuck again... even after restarting the
 osd. It just took some time until the scrub of this pg happened again.

 Greets,
 Stefan
 Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
> Can you share your current OSD configuration? It's very curious that
> your scrub is getting randomly stuck on a few objects for hours at a
> time until an OSD is reset.
>
> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> minutes ago i had another case where i restarted the osd which was shown
>> in objecter_requests output.
>>
>> It seems also other scrubs and deep scrubs were hanging.
>>
>> Output before:
>> 4095 active+clean
>>1 active+clean+scrubbing
>>
>> Output after restart:
>> 4084 active+clean
>>7 active+clean+scrubbing+deep
>>5 active+clean+scrubbing
>>
>> both values are changing every few seconds again doing a lot of scrubs
>> and deep scubs.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
>>> Hi,
>>>
>>> that command does not exist.
>>>
>>> But at least ceph -s permanently reports 1 pg in scrubbing with no 
>>> change.
>>>
>>> Log attached as well.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
 Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
 can quickly scan the new log if you directly send it to me.

 On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
  wrote:
> can send the osd log - if you want?
>
> Stefan
>
> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> the command
>> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>>
>> hangs as well. Doing absolutely nothing... waiting forever.
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
>>> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
>>> waiting for a scrub. What happens if you run "rados -p <pool> rm
>>> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
>>> during this)? If that succeeds while your VM remains blocked on that
>>> remove op, it looks like there is some problem in the OSD where ops
>>> queued on a scrub are not properly awoken when the scrub completes.
>>>
>>> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 after enabling the log and generating a gcore dump, the request was
 successful ;-(

 So the log only contains the successful request. So I was only 
 able to
 catch the successful request. I can send you the log on request.

 Luckily i had another VM on another Cluster behaving the same.

 This time osd.23:
 # ceph --admin-daemon
 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
No, I can't reproduce it with active logs. Any further ideas?

Greets,
Stefan

Am 17.05.2017 um 21:26 schrieb Stefan Priebe - Profihost AG:
> Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
>> Any chance you still have debug logs enabled on OSD 23 after you
>> restarted it and the scrub froze again? 
> 
> No but i can do that ;-) Hopefully it freezes again.
> 
> Stefan
> 
>>
>> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello,
>>>
>>> now it shows again:
> 4095 active+clean
>1 active+clean+scrubbing
>>>
>>> and:
>>> # ceph pg dump | grep -i scrub
>>> dumped all in format plain
>>> pg_stat objects mip degrmispunf bytes   log disklog
>>> state   state_stamp v   reportedup  up_primary
>>> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
>>> deep_scrub_stamp
>>> 2.aa40400   0   0   0   10128667136 3010
>>> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
>>> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
>>> 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
>>>2017-05-04 14:27:39.210713
>>>
>>> So it seems the same scrub is stuck again... even after restarting the
>>> osd. It just took some time until the scrub of this pg happened again.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
 Can you share your current OSD configuration? It's very curious that
 your scrub is getting randomly stuck on a few objects for hours at a
 time until an OSD is reset.

 On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
  wrote:
> Hello Jason,
>
> minutes ago i had another case where i restarted the osd which was shown
> in objecter_requests output.
>
> It seems also other scrubs and deep scrubs were hanging.
>
> Output before:
> 4095 active+clean
>1 active+clean+scrubbing
>
> Output after restart:
> 4084 active+clean
>7 active+clean+scrubbing+deep
>5 active+clean+scrubbing
>
> both values are changing every few seconds again doing a lot of scrubs
> and deep scubs.
>
> Greets,
> Stefan
> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
>> Hi,
>>
>> that command does not exist.
>>
>> But at least ceph -s permanently reports 1 pg in scrubbing with no 
>> change.
>>
>> Log attached as well.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
>>> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
>>> can quickly scan the new log if you directly send it to me.
>>>
>>> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 can send the osd log - if you want?

 Stefan

 Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
>
> the command
> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>
> hangs as well. Doing absolutely nothing... waiting forever.
>
> Greets,
> Stefan
>
> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
>> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
>> waiting for a scrub. What happens if you run "rados -p  rm
>> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
>> during this)? If that succeeds while your VM remains blocked on that
>> remove op, it looks like there is some problem in the OSD where ops
>> queued on a scrub are not properly awoken when the scrub completes.
>>
>> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> after enabling the log and generating a gcore dump, the request was
>>> successful ;-(
>>>
>>> So the log only contains the successfull request. So i was only 
>>> able to
>>> catch the successful request. I can send you the log on request.
>>>
>>> Luckily i had another VM on another Cluster behaving the same.
>>>
>>> This time osd.23:
>>> # ceph --admin-daemon
>>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>>> objecter_requests
>>> {
>>> "ops": [
>>> {
>>> "tid": 18777,
>>> "pg": "2.cebed0aa",
>>> "osd": 23,
>>> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>> "object_locator": "@2",
>>>

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Currently it does not issue a scrub again ;-(

Stefan
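
If waiting for the scheduler is the problem, a scrub can also be requested for that one PG by hand; a small sketch, using pg 2.aa from the dump quoted below:

ceph pg scrub 2.aa
ceph pg deep-scrub 2.aa
# watch whether it actually enters and leaves the scrubbing state
ceph pg 2.aa query | grep -E '"state"|scrub_stamp'
watch -n 10 'ceph -s | grep -i scrub'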

Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
> Any chance you still have debug logs enabled on OSD 23 after you
> restarted it and the scrub froze again?
> 
> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello,
>>
>> now it shows again:
 4095 active+clean
1 active+clean+scrubbing
>>
>> and:
>> # ceph pg dump | grep -i scrub
>> dumped all in format plain
>> pg_stat objects mip degrmispunf bytes   log disklog
>> state   state_stamp v   reportedup  up_primary
>> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
>> deep_scrub_stamp
>> 2.aa40400   0   0   0   10128667136 3010
>> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
>> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
>> 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
>>2017-05-04 14:27:39.210713
>>
>> So it seems the same scrub is stuck again... even after restarting the
>> osd. It just took some time until the scrub of this pg happened again.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
>>> Can you share your current OSD configuration? It's very curious that
>>> your scrub is getting randomly stuck on a few objects for hours at a
>>> time until an OSD is reset.
>>>
>>> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 minutes ago i had another case where i restarted the osd which was shown
 in objecter_requests output.

 It seems also other scrubs and deep scrubs were hanging.

 Output before:
 4095 active+clean
1 active+clean+scrubbing

 Output after restart:
 4084 active+clean
7 active+clean+scrubbing+deep
5 active+clean+scrubbing

 both values are changing every few seconds again doing a lot of scrubs
 and deep scubs.

 Greets,
 Stefan
 Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
> Hi,
>
> that command does not exist.
>
> But at least ceph -s permanently reports 1 pg in scrubbing with no change.
>
> Log attached as well.
>
> Greets,
> Stefan
> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
>> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
>> can quickly scan the new log if you directly send it to me.
>>
>> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> can send the osd log - if you want?
>>>
>>> Stefan
>>>
>>> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
 Hello Jason,

 the command
 # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa

 hangs as well. Doing absolutely nothing... waiting forever.

 Greets,
 Stefan

 Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
> waiting for a scrub. What happens if you run "rados -p  rm
> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
> during this)? If that succeeds while your VM remains blocked on that
> remove op, it looks like there is some problem in the OSD where ops
> queued on a scrub are not properly awoken when the scrub completes.
>
> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> after enabling the log and generating a gcore dump, the request was
>> successful ;-(
>>
>> So the log only contains the successfull request. So i was only able 
>> to
>> catch the successful request. I can send you the log on request.
>>
>> Luckily i had another VM on another Cluster behaving the same.
>>
>> This time osd.23:
>> # ceph --admin-daemon
>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>> objecter_requests
>> {
>> "ops": [
>> {
>> "tid": 18777,
>> "pg": "2.cebed0aa",
>> "osd": 23,
>> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
>> "object_locator": "@2",
>> "target_object_id": 
>> "rbd_data.21aafa6b8b4567.0aaa",
>> "target_object_locator": "@2",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Am 17.05.2017 um 21:21 schrieb Jason Dillaman:
> Any chance you still have debug logs enabled on OSD 23 after you
> restarted it and the scrub froze again? 

No, but I can do that ;-) Hopefully it freezes again.

Stefan

> 
> On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello,
>>
>> now it shows again:
 4095 active+clean
1 active+clean+scrubbing
>>
>> and:
>> # ceph pg dump | grep -i scrub
>> dumped all in format plain
>> pg_stat objects mip degrmispunf bytes   log disklog
>> state   state_stamp v   reportedup  up_primary
>> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
>> deep_scrub_stamp
>> 2.aa40400   0   0   0   10128667136 3010
>> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
>> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
>> 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
>>2017-05-04 14:27:39.210713
>>
>> So it seems the same scrub is stuck again... even after restarting the
>> osd. It just took some time until the scrub of this pg happened again.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
>>> Can you share your current OSD configuration? It's very curious that
>>> your scrub is getting randomly stuck on a few objects for hours at a
>>> time until an OSD is reset.
>>>
>>> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 minutes ago i had another case where i restarted the osd which was shown
 in objecter_requests output.

 It seems also other scrubs and deep scrubs were hanging.

 Output before:
 4095 active+clean
1 active+clean+scrubbing

 Output after restart:
 4084 active+clean
7 active+clean+scrubbing+deep
5 active+clean+scrubbing

 both values are changing every few seconds again doing a lot of scrubs
 and deep scubs.

 Greets,
 Stefan
 Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
> Hi,
>
> that command does not exist.
>
> But at least ceph -s permanently reports 1 pg in scrubbing with no change.
>
> Log attached as well.
>
> Greets,
> Stefan
> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
>> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
>> can quickly scan the new log if you directly send it to me.
>>
>> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> can send the osd log - if you want?
>>>
>>> Stefan
>>>
>>> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
 Hello Jason,

 the command
 # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa

 hangs as well. Doing absolutely nothing... waiting forever.

 Greets,
 Stefan

 Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
> waiting for a scrub. What happens if you run "rados -p  rm
> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
> during this)? If that succeeds while your VM remains blocked on that
> remove op, it looks like there is some problem in the OSD where ops
> queued on a scrub are not properly awoken when the scrub completes.
>
> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> after enabling the log and generating a gcore dump, the request was
>> successful ;-(
>>
>> So the log only contains the successfull request. So i was only able 
>> to
>> catch the successful request. I can send you the log on request.
>>
>> Luckily i had another VM on another Cluster behaving the same.
>>
>> This time osd.23:
>> # ceph --admin-daemon
>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>> objecter_requests
>> {
>> "ops": [
>> {
>> "tid": 18777,
>> "pg": "2.cebed0aa",
>> "osd": 23,
>> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
>> "object_locator": "@2",
>> "target_object_id": 
>> "rbd_data.21aafa6b8b4567.0aaa",
>> "target_object_locator": "@2",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
Any chance you still have debug logs enabled on OSD 23 after you
restarted it and the scrub froze again?
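
For reference, the debug level can be checked and changed on the running OSD without restarting it; a minimal sketch, assuming osd.23 and the default admin socket (ceph daemon has to run on the OSD's host):

ceph daemon osd.23 config get debug_osd        # what is currently active
ceph tell osd.23 injectargs '--debug-osd 20/20'
# ... reproduce the frozen scrub ...
ceph tell osd.23 injectargs '--debug-osd 0/5'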

On Wed, May 17, 2017 at 3:19 PM, Stefan Priebe - Profihost AG
 wrote:
> Hello,
>
> now it shows again:
>>> 4095 active+clean
>>>1 active+clean+scrubbing
>
> and:
> # ceph pg dump | grep -i scrub
> dumped all in format plain
> pg_stat objects mip degrmispunf bytes   log disklog
> state   state_stamp v   reportedup  up_primary
> acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
> deep_scrub_stamp
> 2.aa40400   0   0   0   10128667136 3010
> 3010active+clean+scrubbing  2017-05-11 09:37:37.962700
> 181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
> 23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
>2017-05-04 14:27:39.210713
>
> So it seems the same scrub is stuck again... even after restarting the
> osd. It just took some time until the scrub of this pg happened again.
>
> Greets,
> Stefan
> Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
>> Can you share your current OSD configuration? It's very curious that
>> your scrub is getting randomly stuck on a few objects for hours at a
>> time until an OSD is reset.
>>
>> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> minutes ago i had another case where i restarted the osd which was shown
>>> in objecter_requests output.
>>>
>>> It seems also other scrubs and deep scrubs were hanging.
>>>
>>> Output before:
>>> 4095 active+clean
>>>1 active+clean+scrubbing
>>>
>>> Output after restart:
>>> 4084 active+clean
>>>7 active+clean+scrubbing+deep
>>>5 active+clean+scrubbing
>>>
>>> both values are changing every few seconds again doing a lot of scrubs
>>> and deep scubs.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
 Hi,

 that command does not exist.

 But at least ceph -s permanently reports 1 pg in scrubbing with no change.

 Log attached as well.

 Greets,
 Stefan
 Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
> can quickly scan the new log if you directly send it to me.
>
> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>  wrote:
>> can send the osd log - if you want?
>>
>> Stefan
>>
>> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
>>> Hello Jason,
>>>
>>> the command
>>> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>>>
>>> hangs as well. Doing absolutely nothing... waiting forever.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
 OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
 waiting for a scrub. What happens if you run "rados -p  rm
 rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
 during this)? If that succeeds while your VM remains blocked on that
 remove op, it looks like there is some problem in the OSD where ops
 queued on a scrub are not properly awoken when the scrub completes.

 On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
  wrote:
> Hello Jason,
>
> after enabling the log and generating a gcore dump, the request was
> successful ;-(
>
> So the log only contains the successfull request. So i was only able 
> to
> catch the successful request. I can send you the log on request.
>
> Luckily i had another VM on another Cluster behaving the same.
>
> This time osd.23:
> # ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
> objecter_requests
> {
> "ops": [
> {
> "tid": 18777,
> "pg": "2.cebed0aa",
> "osd": 23,
> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
> "object_locator": "@2",
> "target_object_id": 
> "rbd_data.21aafa6b8b4567.0aaa",
> "target_object_locator": "@2",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "last_sent": "1.83513e+06s",
> "attempts": 1,
> "snapid": "head",
> "snap_context": "28a43=[]",
> "mtime": "2017-05-17 16:51:06.0.455475s",
> 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Hello,

now it shows again:
>> 4095 active+clean
>>1 active+clean+scrubbing

and:
# ceph pg dump | grep -i scrub
dumped all in format plain
pg_stat objects mip degr misp unf bytes   log disklog
state   state_stamp v   reportedup  up_primary
acting  acting_primary  last_scrub  scrub_stamp last_deep_scrub
deep_scrub_stamp
2.aa    4040    0   0   0   0   10128667136 3010
3010    active+clean+scrubbing  2017-05-11 09:37:37.962700
181936'11196478  181936:8688051  [23,41,9]   23  [23,41,9]
23  176730'10793226 2017-05-10 03:43:20.849784  171715'10548192
   2017-05-04 14:27:39.210713

So it seems the same scrub is stuck again... even after restarting the
osd. It just took some time until the scrub of this pg happened again.

Greets,
Stefan
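
Since the interesting PGs are the ones whose scrub state has not changed for a long time, the dump can be filtered on state and state_stamp; a rough sketch (the column numbers follow the pg dump layout shown above and may differ between releases):

ceph pg dump pgs_brief 2>/dev/null | grep -i scrub
# pgid, state and the time the state was entered, for scrubbing PGs only
ceph pg dump 2>/dev/null | awk '$10 ~ /scrubbing/ {print $1, $10, $11, $12}'
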
Am 17.05.2017 um 21:13 schrieb Jason Dillaman:
> Can you share your current OSD configuration? It's very curious that
> your scrub is getting randomly stuck on a few objects for hours at a
> time until an OSD is reset.
> 
> On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> minutes ago i had another case where i restarted the osd which was shown
>> in objecter_requests output.
>>
>> It seems also other scrubs and deep scrubs were hanging.
>>
>> Output before:
>> 4095 active+clean
>>1 active+clean+scrubbing
>>
>> Output after restart:
>> 4084 active+clean
>>7 active+clean+scrubbing+deep
>>5 active+clean+scrubbing
>>
>> both values are changing every few seconds again doing a lot of scrubs
>> and deep scubs.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
>>> Hi,
>>>
>>> that command does not exist.
>>>
>>> But at least ceph -s permanently reports 1 pg in scrubbing with no change.
>>>
>>> Log attached as well.
>>>
>>> Greets,
>>> Stefan
>>> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
 Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
 can quickly scan the new log if you directly send it to me.

 On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
  wrote:
> can send the osd log - if you want?
>
> Stefan
>
> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> the command
>> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>>
>> hangs as well. Doing absolutely nothing... waiting forever.
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
>>> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
>>> waiting for a scrub. What happens if you run "rados -p  rm
>>> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
>>> during this)? If that succeeds while your VM remains blocked on that
>>> remove op, it looks like there is some problem in the OSD where ops
>>> queued on a scrub are not properly awoken when the scrub completes.
>>>
>>> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 after enabling the log and generating a gcore dump, the request was
 successful ;-(

 So the log only contains the successfull request. So i was only able to
 catch the successful request. I can send you the log on request.

 Luckily i had another VM on another Cluster behaving the same.

 This time osd.23:
 # ceph --admin-daemon
 /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
 objecter_requests
 {
 "ops": [
 {
 "tid": 18777,
 "pg": "2.cebed0aa",
 "osd": 23,
 "object_id": "rbd_data.21aafa6b8b4567.0aaa",
 "object_locator": "@2",
 "target_object_id": 
 "rbd_data.21aafa6b8b4567.0aaa",
 "target_object_locator": "@2",
 "paused": 0,
 "used_replica": 0,
 "precalc_pgid": 0,
 "last_sent": "1.83513e+06s",
 "attempts": 1,
 "snapid": "head",
 "snap_context": "28a43=[]",
 "mtime": "2017-05-17 16:51:06.0.455475s",
 "osd_ops": [
 "delete"
 ]
 }
 ],
 "linger_ops": [
 {
 "linger_id": 1,
 "pg": "2.f0709c34",
 "osd": 23,
 "object_id": "rbd_header.21aafa6b8b4567",
 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
Can you share your current OSD configuration? It's very curious that
your scrub is getting randomly stuck on a few objects for hours at a
time until an OSD is reset.
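
A quick way to share that configuration is to dump what the daemon is actually running with, rather than what is in ceph.conf; a sketch, assuming osd.23 and run on its host:

ceph daemon osd.23 config show > osd.23-config.txt
ceph daemon osd.23 config show | grep -E 'scrub|osd_op_thread'   # scrub and op-thread settings only
ceph daemon osd.23 config diff        # only non-default values, if your release has this command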

On Wed, May 17, 2017 at 2:55 PM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> minutes ago i had another case where i restarted the osd which was shown
> in objecter_requests output.
>
> It seems also other scrubs and deep scrubs were hanging.
>
> Output before:
> 4095 active+clean
>1 active+clean+scrubbing
>
> Output after restart:
> 4084 active+clean
>7 active+clean+scrubbing+deep
>5 active+clean+scrubbing
>
> both values are changing every few seconds again doing a lot of scrubs
> and deep scubs.
>
> Greets,
> Stefan
> Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
>> Hi,
>>
>> that command does not exist.
>>
>> But at least ceph -s permanently reports 1 pg in scrubbing with no change.
>>
>> Log attached as well.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
>>> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
>>> can quickly scan the new log if you directly send it to me.
>>>
>>> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 can send the osd log - if you want?

 Stefan

 Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
>
> the command
> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>
> hangs as well. Doing absolutely nothing... waiting forever.
>
> Greets,
> Stefan
>
> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
>> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
>> waiting for a scrub. What happens if you run "rados -p  rm
>> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
>> during this)? If that succeeds while your VM remains blocked on that
>> remove op, it looks like there is some problem in the OSD where ops
>> queued on a scrub are not properly awoken when the scrub completes.
>>
>> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> after enabling the log and generating a gcore dump, the request was
>>> successful ;-(
>>>
>>> So the log only contains the successfull request. So i was only able to
>>> catch the successful request. I can send you the log on request.
>>>
>>> Luckily i had another VM on another Cluster behaving the same.
>>>
>>> This time osd.23:
>>> # ceph --admin-daemon
>>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>>> objecter_requests
>>> {
>>> "ops": [
>>> {
>>> "tid": 18777,
>>> "pg": "2.cebed0aa",
>>> "osd": 23,
>>> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>> "object_locator": "@2",
>>> "target_object_id": 
>>> "rbd_data.21aafa6b8b4567.0aaa",
>>> "target_object_locator": "@2",
>>> "paused": 0,
>>> "used_replica": 0,
>>> "precalc_pgid": 0,
>>> "last_sent": "1.83513e+06s",
>>> "attempts": 1,
>>> "snapid": "head",
>>> "snap_context": "28a43=[]",
>>> "mtime": "2017-05-17 16:51:06.0.455475s",
>>> "osd_ops": [
>>> "delete"
>>> ]
>>> }
>>> ],
>>> "linger_ops": [
>>> {
>>> "linger_id": 1,
>>> "pg": "2.f0709c34",
>>> "osd": 23,
>>> "object_id": "rbd_header.21aafa6b8b4567",
>>> "object_locator": "@2",
>>> "target_object_id": "rbd_header.21aafa6b8b4567",
>>> "target_object_locator": "@2",
>>> "paused": 0,
>>> "used_replica": 0,
>>> "precalc_pgid": 0,
>>> "snapid": "head",
>>> "registered": "1"
>>> }
>>> ],
>>> "pool_ops": [],
>>> "pool_stat_ops": [],
>>> "statfs_ops": [],
>>> "command_ops": []
>>> }
>>>
>>> OSD Logfile of OSD 23 attached.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
 On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
  wrote:
> You mean the request no matter if it is successful or not? Which log
> level should be set to 20?


 I'm hoping you can re-create the hung remove op when OSD logging is
 increased -- "debug osd = 20" would be nice if you can turn it 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Hello Jason,

Minutes ago I had another case where I restarted the OSD which was shown
in the objecter_requests output.

It seems other scrubs and deep scrubs were hanging as well.

Output before:
4095 active+clean
   1 active+clean+scrubbing

Output after restart:
4084 active+clean
   7 active+clean+scrubbing+deep
   5 active+clean+scrubbing

Both values are changing every few seconds again, doing a lot of scrubs
and deep scrubs.

Greets,
Stefan
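
A sketch of that restart workflow, assuming the client admin socket path quoted below, jq for the JSON, and a systemd deployment (on sysvinit hosts the restart goes through /etc/init.d/ceph instead):

ASOK=/var/run/ceph/ceph-client.admin.22969.140085040783360.asok
# which OSD and object is the stuck client op waiting on?
ceph --admin-daemon "$ASOK" objecter_requests \
  | jq -r '.ops[] | "tid \(.tid) -> osd.\(.osd) \(.object_id)"'
# then bounce that OSD on its host, here osd.23
systemctl restart ceph-osd@23
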
Am 17.05.2017 um 20:36 schrieb Stefan Priebe - Profihost AG:
> Hi,
> 
> that command does not exist.
> 
> But at least ceph -s permanently reports 1 pg in scrubbing with no change.
> 
> Log attached as well.
> 
> Greets,
> Stefan
> Am 17.05.2017 um 20:20 schrieb Jason Dillaman:
>> Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
>> can quickly scan the new log if you directly send it to me.
>>
>> On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> can send the osd log - if you want?
>>>
>>> Stefan
>>>
>>> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
 Hello Jason,

 the command
 # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa

 hangs as well. Doing absolutely nothing... waiting forever.

 Greets,
 Stefan

 Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
> waiting for a scrub. What happens if you run "rados -p  rm
> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
> during this)? If that succeeds while your VM remains blocked on that
> remove op, it looks like there is some problem in the OSD where ops
> queued on a scrub are not properly awoken when the scrub completes.
>
> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> after enabling the log and generating a gcore dump, the request was
>> successful ;-(
>>
>> So the log only contains the successfull request. So i was only able to
>> catch the successful request. I can send you the log on request.
>>
>> Luckily i had another VM on another Cluster behaving the same.
>>
>> This time osd.23:
>> # ceph --admin-daemon
>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>> objecter_requests
>> {
>> "ops": [
>> {
>> "tid": 18777,
>> "pg": "2.cebed0aa",
>> "osd": 23,
>> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
>> "object_locator": "@2",
>> "target_object_id": 
>> "rbd_data.21aafa6b8b4567.0aaa",
>> "target_object_locator": "@2",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "last_sent": "1.83513e+06s",
>> "attempts": 1,
>> "snapid": "head",
>> "snap_context": "28a43=[]",
>> "mtime": "2017-05-17 16:51:06.0.455475s",
>> "osd_ops": [
>> "delete"
>> ]
>> }
>> ],
>> "linger_ops": [
>> {
>> "linger_id": 1,
>> "pg": "2.f0709c34",
>> "osd": 23,
>> "object_id": "rbd_header.21aafa6b8b4567",
>> "object_locator": "@2",
>> "target_object_id": "rbd_header.21aafa6b8b4567",
>> "target_object_locator": "@2",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "snapid": "head",
>> "registered": "1"
>> }
>> ],
>> "pool_ops": [],
>> "pool_stat_ops": [],
>> "statfs_ops": [],
>> "command_ops": []
>> }
>>
>> OSD Logfile of OSD 23 attached.
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
>>> On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 You mean the request no matter if it is successful or not? Which log
 level should be set to 20?
>>>
>>>
>>> I'm hoping you can re-create the hung remove op when OSD logging is
>>> increased -- "debug osd = 20" would be nice if you can turn it up that
>>> high while attempting to capture the blocked op.
>>>
>
>
>

>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I
can quickly scan the new log if you directly send it to me.
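
For reference, the PG behind that object placement, and its scrub state, can be checked directly; a small sketch, assuming the pool and object names used earlier in the thread:

# which PG and OSDs does the object map to?
ceph osd map cephstor6 rbd_data.21aafa6b8b4567.0aaa
# then query that PG (the id printed above, e.g. 2.aa from the pg dump elsewhere in the thread)
ceph pg 2.aa query | grep -E '"state"|scrub_stamp'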

On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
 wrote:
> can send the osd log - if you want?
>
> Stefan
>
> Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> the command
>> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>>
>> hangs as well. Doing absolutely nothing... waiting forever.
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
>>> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
>>> waiting for a scrub. What happens if you run "rados -p <pool> rm
>>> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
>>> during this)? If that succeeds while your VM remains blocked on that
>>> remove op, it looks like there is some problem in the OSD where ops
>>> queued on a scrub are not properly awoken when the scrub completes.
>>>
>>> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 after enabling the log and generating a gcore dump, the request was
 successful ;-(

 So the log only contains the successfull request. So i was only able to
 catch the successful request. I can send you the log on request.

 Luckily i had another VM on another Cluster behaving the same.

 This time osd.23:
 # ceph --admin-daemon
 /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
 objecter_requests
 {
 "ops": [
 {
 "tid": 18777,
 "pg": "2.cebed0aa",
 "osd": 23,
 "object_id": "rbd_data.21aafa6b8b4567.0aaa",
 "object_locator": "@2",
 "target_object_id": "rbd_data.21aafa6b8b4567.0aaa",
 "target_object_locator": "@2",
 "paused": 0,
 "used_replica": 0,
 "precalc_pgid": 0,
 "last_sent": "1.83513e+06s",
 "attempts": 1,
 "snapid": "head",
 "snap_context": "28a43=[]",
 "mtime": "2017-05-17 16:51:06.0.455475s",
 "osd_ops": [
 "delete"
 ]
 }
 ],
 "linger_ops": [
 {
 "linger_id": 1,
 "pg": "2.f0709c34",
 "osd": 23,
 "object_id": "rbd_header.21aafa6b8b4567",
 "object_locator": "@2",
 "target_object_id": "rbd_header.21aafa6b8b4567",
 "target_object_locator": "@2",
 "paused": 0,
 "used_replica": 0,
 "precalc_pgid": 0,
 "snapid": "head",
 "registered": "1"
 }
 ],
 "pool_ops": [],
 "pool_stat_ops": [],
 "statfs_ops": [],
 "command_ops": []
 }

 OSD Logfile of OSD 23 attached.

 Greets,
 Stefan

 Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
> On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
>  wrote:
>> You mean the request no matter if it is successful or not? Which log
>> level should be set to 20?
>
>
> I'm hoping you can re-create the hung remove op when OSD logging is
> increased -- "debug osd = 20" would be nice if you can turn it up that
> high while attempting to capture the blocked op.
>
>>>
>>>
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
I can send the osd log, if you want?

Stefan

Am 17.05.2017 um 20:13 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
> 
> the command
> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
> 
> hangs as well. Doing absolutely nothing... waiting forever.
> 
> Greets,
> Stefan
> 
> Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
>> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
>> waiting for a scrub. What happens if you run "rados -p  rm
>> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
>> during this)? If that succeeds while your VM remains blocked on that
>> remove op, it looks like there is some problem in the OSD where ops
>> queued on a scrub are not properly awoken when the scrub completes.
>>
>> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> after enabling the log and generating a gcore dump, the request was
>>> successful ;-(
>>>
>>> So the log only contains the successfull request. So i was only able to
>>> catch the successful request. I can send you the log on request.
>>>
>>> Luckily i had another VM on another Cluster behaving the same.
>>>
>>> This time osd.23:
>>> # ceph --admin-daemon
>>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>>> objecter_requests
>>> {
>>> "ops": [
>>> {
>>> "tid": 18777,
>>> "pg": "2.cebed0aa",
>>> "osd": 23,
>>> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>> "object_locator": "@2",
>>> "target_object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>> "target_object_locator": "@2",
>>> "paused": 0,
>>> "used_replica": 0,
>>> "precalc_pgid": 0,
>>> "last_sent": "1.83513e+06s",
>>> "attempts": 1,
>>> "snapid": "head",
>>> "snap_context": "28a43=[]",
>>> "mtime": "2017-05-17 16:51:06.0.455475s",
>>> "osd_ops": [
>>> "delete"
>>> ]
>>> }
>>> ],
>>> "linger_ops": [
>>> {
>>> "linger_id": 1,
>>> "pg": "2.f0709c34",
>>> "osd": 23,
>>> "object_id": "rbd_header.21aafa6b8b4567",
>>> "object_locator": "@2",
>>> "target_object_id": "rbd_header.21aafa6b8b4567",
>>> "target_object_locator": "@2",
>>> "paused": 0,
>>> "used_replica": 0,
>>> "precalc_pgid": 0,
>>> "snapid": "head",
>>> "registered": "1"
>>> }
>>> ],
>>> "pool_ops": [],
>>> "pool_stat_ops": [],
>>> "statfs_ops": [],
>>> "command_ops": []
>>> }
>>>
>>> OSD Logfile of OSD 23 attached.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
 On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
  wrote:
> You mean the request no matter if it is successful or not? Which log
> level should be set to 20?


 I'm hoping you can re-create the hung remove op when OSD logging is
 increased -- "debug osd = 20" would be nice if you can turn it up that
 high while attempting to capture the blocked op.

>>
>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Hello Jason,

the command
# rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa

hangs as well. Doing absolutely nothing... waiting forever.

Greets,
Stefan

Am 17.05.2017 um 17:05 schrieb Jason Dillaman:
> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
> waiting for a scrub. What happens if you run "rados -p  rm
> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
> during this)? If that succeeds while your VM remains blocked on that
> remove op, it looks like there is some problem in the OSD where ops
> queued on a scrub are not properly awoken when the scrub completes.
> 
> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> after enabling the log and generating a gcore dump, the request was
>> successful ;-(
>> 
>> So the log only contains the successfull request. So i was only able to
>> catch the successful request. I can send you the log on request.
>>
>> Luckily i had another VM on another Cluster behaving the same.
>>
>> This time osd.23:
>> # ceph --admin-daemon
>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>> objecter_requests
>> {
>> "ops": [
>> {
>> "tid": 18777,
>> "pg": "2.cebed0aa",
>> "osd": 23,
>> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
>> "object_locator": "@2",
>> "target_object_id": "rbd_data.21aafa6b8b4567.0aaa",
>> "target_object_locator": "@2",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "last_sent": "1.83513e+06s",
>> "attempts": 1,
>> "snapid": "head",
>> "snap_context": "28a43=[]",
>> "mtime": "2017-05-17 16:51:06.0.455475s",
>> "osd_ops": [
>> "delete"
>> ]
>> }
>> ],
>> "linger_ops": [
>> {
>> "linger_id": 1,
>> "pg": "2.f0709c34",
>> "osd": 23,
>> "object_id": "rbd_header.21aafa6b8b4567",
>> "object_locator": "@2",
>> "target_object_id": "rbd_header.21aafa6b8b4567",
>> "target_object_locator": "@2",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "snapid": "head",
>> "registered": "1"
>> }
>> ],
>> "pool_ops": [],
>> "pool_stat_ops": [],
>> "statfs_ops": [],
>> "command_ops": []
>> }
>>
>> OSD Logfile of OSD 23 attached.
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
>>> On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 You mean the request no matter if it is successful or not? Which log
 level should be set to 20?
>>>
>>>
>>> I'm hoping you can re-create the hung remove op when OSD logging is
>>> increased -- "debug osd = 20" would be nice if you can turn it up that
>>> high while attempting to capture the blocked op.
>>>
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Ah no, wrong thread. Will test your suggestion.

Stefan

Excuse my typo sent from my mobile phone.

> Am 17.05.2017 um 17:05 schrieb Jason Dillaman :
> 
> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
> waiting for a scrub. What happens if you run "rados -p  rm
> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
> during this)? If that succeeds while your VM remains blocked on that
> remove op, it looks like there is some problem in the OSD where ops
> queued on a scrub are not properly awoken when the scrub completes.
> 
> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>> 
>> after enabling the log and generating a gcore dump, the request was
>> successful ;-(
>> 
>> So the log only contains the successfull request. So i was only able to
>> catch the successful request. I can send you the log on request.
>> 
>> Luckily i had another VM on another Cluster behaving the same.
>> 
>> This time osd.23:
>> # ceph --admin-daemon
>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>> objecter_requests
>> {
>>"ops": [
>>{
>>"tid": 18777,
>>"pg": "2.cebed0aa",
>>"osd": 23,
>>"object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>"object_locator": "@2",
>>"target_object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>"target_object_locator": "@2",
>>"paused": 0,
>>"used_replica": 0,
>>"precalc_pgid": 0,
>>"last_sent": "1.83513e+06s",
>>"attempts": 1,
>>"snapid": "head",
>>"snap_context": "28a43=[]",
>>"mtime": "2017-05-17 16:51:06.0.455475s",
>>"osd_ops": [
>>"delete"
>>]
>>}
>>],
>>"linger_ops": [
>>{
>>"linger_id": 1,
>>"pg": "2.f0709c34",
>>"osd": 23,
>>"object_id": "rbd_header.21aafa6b8b4567",
>>"object_locator": "@2",
>>"target_object_id": "rbd_header.21aafa6b8b4567",
>>"target_object_locator": "@2",
>>"paused": 0,
>>"used_replica": 0,
>>"precalc_pgid": 0,
>>"snapid": "head",
>>"registered": "1"
>>}
>>],
>>"pool_ops": [],
>>"pool_stat_ops": [],
>>"statfs_ops": [],
>>"command_ops": []
>> }
>> 
>> OSD Logfile of OSD 23 attached.
>> 
>> Greets,
>> Stefan
>> 
>>> Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
>>> On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 You mean the request no matter if it is successful or not? Which log
 level should be set to 20?
>>> 
>>> 
>>> I'm hoping you can re-create the hung remove op when OSD logging is
>>> increased -- "debug osd = 20" would be nice if you can turn it up that
>>> high while attempting to capture the blocked op.
>>> 
> 
> 
> 
> -- 
> Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Can test in 2 hours, but it sounds like
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-December/014773.html

Stefan

Excuse my typo sent from my mobile phone.

> Am 17.05.2017 um 17:05 schrieb Jason Dillaman :
> 
> OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
> waiting for a scrub. What happens if you run "rados -p  rm
> rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
> during this)? If that succeeds while your VM remains blocked on that
> remove op, it looks like there is some problem in the OSD where ops
> queued on a scrub are not properly awoken when the scrub completes.
> 
> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>> 
>> after enabling the log and generating a gcore dump, the request was
>> successful ;-(
>> 
>> So the log only contains the successfull request. So i was only able to
>> catch the successful request. I can send you the log on request.
>> 
>> Luckily i had another VM on another Cluster behaving the same.
>> 
>> This time osd.23:
>> # ceph --admin-daemon
>> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
>> objecter_requests
>> {
>>"ops": [
>>{
>>"tid": 18777,
>>"pg": "2.cebed0aa",
>>"osd": 23,
>>"object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>"object_locator": "@2",
>>"target_object_id": "rbd_data.21aafa6b8b4567.0aaa",
>>"target_object_locator": "@2",
>>"paused": 0,
>>"used_replica": 0,
>>"precalc_pgid": 0,
>>"last_sent": "1.83513e+06s",
>>"attempts": 1,
>>"snapid": "head",
>>"snap_context": "28a43=[]",
>>"mtime": "2017-05-17 16:51:06.0.455475s",
>>"osd_ops": [
>>"delete"
>>]
>>}
>>],
>>"linger_ops": [
>>{
>>"linger_id": 1,
>>"pg": "2.f0709c34",
>>"osd": 23,
>>"object_id": "rbd_header.21aafa6b8b4567",
>>"object_locator": "@2",
>>"target_object_id": "rbd_header.21aafa6b8b4567",
>>"target_object_locator": "@2",
>>"paused": 0,
>>"used_replica": 0,
>>"precalc_pgid": 0,
>>"snapid": "head",
>>"registered": "1"
>>}
>>],
>>"pool_ops": [],
>>"pool_stat_ops": [],
>>"statfs_ops": [],
>>"command_ops": []
>> }
>> 
>> OSD Logfile of OSD 23 attached.
>> 
>> Greets,
>> Stefan
>> 
>>> Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
>>> On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 You mean the request no matter if it is successful or not? Which log
 level should be set to 20?
>>> 
>>> 
>>> I'm hoping you can re-create the hung remove op when OSD logging is
>>> increased -- "debug osd = 20" would be nice if you can turn it up that
>>> high while attempting to capture the blocked op.
>>> 
> 
> 
> 
> -- 
> Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
OSD 23 notes that object rbd_data.21aafa6b8b4567.0aaa is
waiting for a scrub. What happens if you run "rados -p  rm
rbd_data.21aafa6b8b4567.0aaa" (capturing the OSD 23 logs
during this)? If that succeeds while your VM remains blocked on that
remove op, it looks like there is some problem in the OSD where ops
queued on a scrub are not properly awoken when the scrub completes.
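
A sketch of how that could be driven, assuming the pool is cephstor6 (as used
elsewhere in this thread) and using the object name exactly as abbreviated
above; remember to turn the debug level back down afterwards:

ceph tell osd.23 injectargs '--debug-osd 20 --debug-ms 1'
rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
ceph tell osd.23 injectargs '--debug-osd 0/5 --debug-ms 0/5'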

On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> after enabling the log and generating a gcore dump, the request was
> successful ;-(
>
> So the log only contains the successfull request. So i was only able to
> catch the successful request. I can send you the log on request.
>
> Luckily i had another VM on another Cluster behaving the same.
>
> This time osd.23:
> # ceph --admin-daemon
> /var/run/ceph/ceph-client.admin.22969.140085040783360.asok
> objecter_requests
> {
> "ops": [
> {
> "tid": 18777,
> "pg": "2.cebed0aa",
> "osd": 23,
> "object_id": "rbd_data.21aafa6b8b4567.0aaa",
> "object_locator": "@2",
> "target_object_id": "rbd_data.21aafa6b8b4567.0aaa",
> "target_object_locator": "@2",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "last_sent": "1.83513e+06s",
> "attempts": 1,
> "snapid": "head",
> "snap_context": "28a43=[]",
> "mtime": "2017-05-17 16:51:06.0.455475s",
> "osd_ops": [
> "delete"
> ]
> }
> ],
> "linger_ops": [
> {
> "linger_id": 1,
> "pg": "2.f0709c34",
> "osd": 23,
> "object_id": "rbd_header.21aafa6b8b4567",
> "object_locator": "@2",
> "target_object_id": "rbd_header.21aafa6b8b4567",
> "target_object_locator": "@2",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "snapid": "head",
> "registered": "1"
> }
> ],
> "pool_ops": [],
> "pool_stat_ops": [],
> "statfs_ops": [],
> "command_ops": []
> }
>
> OSD Logfile of OSD 23 attached.
>
> Greets,
> Stefan
>
> Am 17.05.2017 um 16:26 schrieb Jason Dillaman:
>> On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> You mean the request no matter if it is successful or not? Which log
>>> level should be set to 20?
>>
>>
>> I'm hoping you can re-create the hung remove op when OSD logging is
>> increased -- "debug osd = 20" would be nice if you can turn it up that
>> high while attempting to capture the blocked op.
>>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
On Wed, May 17, 2017 at 10:25 AM, Stefan Priebe - Profihost AG
 wrote:
> issue the delete request and send you the log?


Yes, please.

-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
 wrote:
> You mean the request no matter if it is successful or not? Which log
> level should be set to 20?


I'm hoping you can re-create the hung remove op when OSD logging is
increased -- "debug osd = 20" would be nice if you can turn it up that
high while attempting to capture the blocked op.

-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
I can set the following debug levels for osd 46:

ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_osd 20
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set
debug_filestore 20
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_ms 1
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set
debug_optracker 20

then issue the delete request and send you the log?
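
And afterwards, a sketch for dropping the levels back down (the values roughly
match the Jewel defaults; adjust if you run non-default levels):

ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_osd 0/5
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_filestore 1/3
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_ms 0/5
ceph --admin-daemon /var/run/ceph/ceph-osd.$1.asok config set debug_optracker 0/5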

Greets,
Stefan

Am 17.05.2017 um 16:07 schrieb Jason Dillaman:
> The internals of the OSD is not exactly my area of expertise. Since
> you have the op tracker disabled and I'm assuming your cluster health
> is OK, perhaps you could run "gcore " to preserve its
> current state at a bare minimum. Then, assuming your can restart vm191
> and re-issue the fstrim that blocks, perhaps you could increase the
> debug log level of OSD 46 and see if you can capture the remove
> request for rbd_data.e10ca56b8b4567.311c.
> 
> On Wed, May 17, 2017 at 9:44 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> any idea how to debug this further? dmesg does not show any disk
>> failures. Smart values are also OK. There's also no xfs BUG or WARNING
>> from the kernel side.
>>
>> I'm sure that it will work after restarting osd.46 - but i'm loosing the
>> ability to reproduce this in that case.
>>
>> Should i install debug symbols and issue a gdb: "thread apply all bt full"?
>>
>> Does it help?
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 15:12 schrieb Jason Dillaman:
>>> Perfect librbd log capture. I can see that a remove request to object
>>> rbd_data.e10ca56b8b4567.311c was issued but it never
>>> completed. This results in a hung discard and flush request.
>>>
>>> Assuming that object is still mapped to OSD 46, I think there is
>>> either something happening with that OSD or its backing disk that is
>>> preventing the remove for that object from completing.
>>>
>>> On Wed, May 17, 2017 at 6:04 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 attached is a rbd debug 20 log from the situation when the deadlock
 occours. I hope this helps.

 The VM was running until 2017-05-17 12:10 but there was no I/O for 10 min.

 Greets,
 Stefan

 Am 16.05.2017 um 22:54 schrieb Jason Dillaman:
> It looks like it's just a ping message in that capture.
>
> Are you saying that you restarted OSD 46 and the problem persisted?
>
> On Tue, May 16, 2017 at 4:02 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello,
>>
>> while reproducing the problem, objecter_requests looks like this:
>>
>> {
>> "ops": [
>> {
>> "tid": 42029,
>> "pg": "5.bd9616ad",
>> "osd": 46,
>> "object_id": "rbd_data.e10ca56b8b4567.311c",
>> "object_locator": "@5",
>> "target_object_id": 
>> "rbd_data.e10ca56b8b4567.311c",
>> "target_object_locator": "@5",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "last_sent": "2.28854e+06s",
>> "attempts": 1,
>> "snapid": "head",
>> "snap_context": "a07c2=[]",
>> "mtime": "2017-05-16 21:53:22.0.069541s",
>> "osd_ops": [
>> "delete"
>> ]
>> }
>> ],
>> "linger_ops": [
>> {
>> "linger_id": 1,
>> "pg": "5.5f3bd635",
>> "osd": 17,
>> "object_id": "rbd_header.e10ca56b8b4567",
>> "object_locator": "@5",
>> "target_object_id": "rbd_header.e10ca56b8b4567",
>> "target_object_locator": "@5",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "snapid": "head",
>> "registered": "1"
>> }
>> ],
>> "pool_ops": [],
>> "pool_stat_ops": [],
>> "statfs_ops": [],
>> "command_ops": []
>> }
>>
>> Yes they've an established TCP connection. Qemu <=> osd.46. Attached is
>> a pcap file of the traffic between them when it got stuck.
>>
>> Greets,
>> Stefan
>>
>> Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
>>> On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 We've enabled the op tracker for performance reasons while using SSD
 only storage ;-(
>>>
>>> Disabled you mean?
>>>
 Can enable the op tracker using ceph osd tell? Than reproduce the
 problem. Check what has stucked again? Or should i generate an rbd log
 from the client?
>>>
>>> From a super-quick glance at the 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Am 17.05.2017 um 16:07 schrieb Jason Dillaman:
> The internals of the OSD is not exactly my area of expertise. Since
> you have the op tracker disabled and I'm assuming your cluster health
> is OK, perhaps you could run "gcore " to preserve its
> current state at a bare minimum. Then, assuming your can restart vm191
> and re-issue the fstrim that blocks, perhaps you could increase the
> debug log level of OSD 46 and see if you can capture the remove
> request for rbd_data.e10ca56b8b4567.311c.

You mean the request no matter if it is successful or not? Which log
level should be set to 20?

Greets,
Stefan

> 
> On Wed, May 17, 2017 at 9:44 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> any idea how to debug this further? dmesg does not show any disk
>> failures. Smart values are also OK. There's also no xfs BUG or WARNING
>> from the kernel side.
>>
>> I'm sure that it will work after restarting osd.46 - but i'm loosing the
>> ability to reproduce this in that case.
>>
>> Should i install debug symbols and issue a gdb: "thread apply all bt full"?
>>
>> Does it help?
>>
>> Greets,
>> Stefan
>>
>> Am 17.05.2017 um 15:12 schrieb Jason Dillaman:
>>> Perfect librbd log capture. I can see that a remove request to object
>>> rbd_data.e10ca56b8b4567.311c was issued but it never
>>> completed. This results in a hung discard and flush request.
>>>
>>> Assuming that object is still mapped to OSD 46, I think there is
>>> either something happening with that OSD or its backing disk that is
>>> preventing the remove for that object from completing.
>>>
>>> On Wed, May 17, 2017 at 6:04 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 attached is a rbd debug 20 log from the situation when the deadlock
 occours. I hope this helps.

 The VM was running until 2017-05-17 12:10 but there was no I/O for 10 min.

 Greets,
 Stefan

 Am 16.05.2017 um 22:54 schrieb Jason Dillaman:
> It looks like it's just a ping message in that capture.
>
> Are you saying that you restarted OSD 46 and the problem persisted?
>
> On Tue, May 16, 2017 at 4:02 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello,
>>
>> while reproducing the problem, objecter_requests looks like this:
>>
>> {
>> "ops": [
>> {
>> "tid": 42029,
>> "pg": "5.bd9616ad",
>> "osd": 46,
>> "object_id": "rbd_data.e10ca56b8b4567.311c",
>> "object_locator": "@5",
>> "target_object_id": 
>> "rbd_data.e10ca56b8b4567.311c",
>> "target_object_locator": "@5",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "last_sent": "2.28854e+06s",
>> "attempts": 1,
>> "snapid": "head",
>> "snap_context": "a07c2=[]",
>> "mtime": "2017-05-16 21:53:22.0.069541s",
>> "osd_ops": [
>> "delete"
>> ]
>> }
>> ],
>> "linger_ops": [
>> {
>> "linger_id": 1,
>> "pg": "5.5f3bd635",
>> "osd": 17,
>> "object_id": "rbd_header.e10ca56b8b4567",
>> "object_locator": "@5",
>> "target_object_id": "rbd_header.e10ca56b8b4567",
>> "target_object_locator": "@5",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "snapid": "head",
>> "registered": "1"
>> }
>> ],
>> "pool_ops": [],
>> "pool_stat_ops": [],
>> "statfs_ops": [],
>> "command_ops": []
>> }
>>
>> Yes they've an established TCP connection. Qemu <=> osd.46. Attached is
>> a pcap file of the traffic between them when it got stuck.
>>
>> Greets,
>> Stefan
>>
>> Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
>>> On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 We've enabled the op tracker for performance reasons while using SSD
 only storage ;-(
>>>
>>> Disabled you mean?
>>>
 Can enable the op tracker using ceph osd tell? Than reproduce the
 problem. Check what has stucked again? Or should i generate an rbd log
 from the client?
>>>
>>> From a super-quick glance at the code, it looks like that isn't a
>>> dynamic setting. Of course, it's possible that if you restart OSD 46
>>> to enable the op tracker, the stuck op will clear itself and the VM
>>> will resume. You could attempt to generate a gcore of OSD 46 to see if
>>> information on that op could be 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
The internals of the OSD is not exactly my area of expertise. Since
you have the op tracker disabled and I'm assuming your cluster health
is OK, perhaps you could run "gcore " to preserve its
current state at a bare minimum. Then, assuming you can restart vm191
and re-issue the fstrim that blocks, perhaps you could increase the
debug log level of OSD 46 and see if you can capture the remove
request for rbd_data.e10ca56b8b4567.311c.
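
As a minimal sketch of that capture, assuming a single ceph-osd process for
id 46 on the host, installed debug symbols, and illustrative paths:

OSD_PID=$(pgrep -f 'ceph-osd .*(-i|--id) 46')
gcore -o /var/tmp/ceph-osd.46 "$OSD_PID"              # writes /var/tmp/ceph-osd.46.<pid>
gdb /usr/bin/ceph-osd "/var/tmp/ceph-osd.46.$OSD_PID" \
    -batch -ex 'thread apply all bt' > /var/tmp/ceph-osd.46.bt.txt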

On Wed, May 17, 2017 at 9:44 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> any idea how to debug this further? dmesg does not show any disk
> failures. Smart values are also OK. There's also no xfs BUG or WARNING
> from the kernel side.
>
> I'm sure that it will work after restarting osd.46 - but i'm loosing the
> ability to reproduce this in that case.
>
> Should i install debug symbols and issue a gdb: "thread apply all bt full"?
>
> Does it help?
>
> Greets,
> Stefan
>
> Am 17.05.2017 um 15:12 schrieb Jason Dillaman:
>> Perfect librbd log capture. I can see that a remove request to object
>> rbd_data.e10ca56b8b4567.311c was issued but it never
>> completed. This results in a hung discard and flush request.
>>
>> Assuming that object is still mapped to OSD 46, I think there is
>> either something happening with that OSD or its backing disk that is
>> preventing the remove for that object from completing.
>>
>> On Wed, May 17, 2017 at 6:04 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> attached is a rbd debug 20 log from the situation when the deadlock
>>> occours. I hope this helps.
>>>
>>> The VM was running until 2017-05-17 12:10 but there was no I/O for 10 min.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 16.05.2017 um 22:54 schrieb Jason Dillaman:
 It looks like it's just a ping message in that capture.

 Are you saying that you restarted OSD 46 and the problem persisted?

 On Tue, May 16, 2017 at 4:02 PM, Stefan Priebe - Profihost AG
  wrote:
> Hello,
>
> while reproducing the problem, objecter_requests looks like this:
>
> {
> "ops": [
> {
> "tid": 42029,
> "pg": "5.bd9616ad",
> "osd": 46,
> "object_id": "rbd_data.e10ca56b8b4567.311c",
> "object_locator": "@5",
> "target_object_id": 
> "rbd_data.e10ca56b8b4567.311c",
> "target_object_locator": "@5",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "last_sent": "2.28854e+06s",
> "attempts": 1,
> "snapid": "head",
> "snap_context": "a07c2=[]",
> "mtime": "2017-05-16 21:53:22.0.069541s",
> "osd_ops": [
> "delete"
> ]
> }
> ],
> "linger_ops": [
> {
> "linger_id": 1,
> "pg": "5.5f3bd635",
> "osd": 17,
> "object_id": "rbd_header.e10ca56b8b4567",
> "object_locator": "@5",
> "target_object_id": "rbd_header.e10ca56b8b4567",
> "target_object_locator": "@5",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "snapid": "head",
> "registered": "1"
> }
> ],
> "pool_ops": [],
> "pool_stat_ops": [],
> "statfs_ops": [],
> "command_ops": []
> }
>
> Yes they've an established TCP connection. Qemu <=> osd.46. Attached is
> a pcap file of the traffic between them when it got stuck.
>
> Greets,
> Stefan
>
> Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
>> On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> We've enabled the op tracker for performance reasons while using SSD
>>> only storage ;-(
>>
>> Disabled you mean?
>>
>>> Can enable the op tracker using ceph osd tell? Than reproduce the
>>> problem. Check what has stucked again? Or should i generate an rbd log
>>> from the client?
>>
>> From a super-quick glance at the code, it looks like that isn't a
>> dynamic setting. Of course, it's possible that if you restart OSD 46
>> to enable the op tracker, the stuck op will clear itself and the VM
>> will resume. You could attempt to generate a gcore of OSD 46 to see if
>> information on that op could be extracted via the debugger, but no
>> guarantees.
>>
>> You might want to verify that the stuck client and OSD 46 have an
>> actual established TCP connection as well before doing any further
>> actions.
>>



>>
>>
>>



-- 
Jason

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Stefan Priebe - Profihost AG
Hello Jason,

Any idea how to debug this further? dmesg does not show any disk
failures. SMART values are also OK. There's also no xfs BUG or WARNING
from the kernel side.

I'm sure that it will work after restarting osd.46 - but I'm losing the
ability to reproduce this in that case.

Should I install debug symbols and run gdb with "thread apply all bt full"?

Would that help?

Greets,
Stefan

Am 17.05.2017 um 15:12 schrieb Jason Dillaman:
> Perfect librbd log capture. I can see that a remove request to object
> rbd_data.e10ca56b8b4567.311c was issued but it never
> completed. This results in a hung discard and flush request.
> 
> Assuming that object is still mapped to OSD 46, I think there is
> either something happening with that OSD or its backing disk that is
> preventing the remove for that object from completing.
> 
> On Wed, May 17, 2017 at 6:04 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> attached is a rbd debug 20 log from the situation when the deadlock
>> occours. I hope this helps.
>>
>> The VM was running until 2017-05-17 12:10 but there was no I/O for 10 min.
>>
>> Greets,
>> Stefan
>>
>> Am 16.05.2017 um 22:54 schrieb Jason Dillaman:
>>> It looks like it's just a ping message in that capture.
>>>
>>> Are you saying that you restarted OSD 46 and the problem persisted?
>>>
>>> On Tue, May 16, 2017 at 4:02 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello,

 while reproducing the problem, objecter_requests looks like this:

 {
 "ops": [
 {
 "tid": 42029,
 "pg": "5.bd9616ad",
 "osd": 46,
 "object_id": "rbd_data.e10ca56b8b4567.311c",
 "object_locator": "@5",
 "target_object_id": "rbd_data.e10ca56b8b4567.311c",
 "target_object_locator": "@5",
 "paused": 0,
 "used_replica": 0,
 "precalc_pgid": 0,
 "last_sent": "2.28854e+06s",
 "attempts": 1,
 "snapid": "head",
 "snap_context": "a07c2=[]",
 "mtime": "2017-05-16 21:53:22.0.069541s",
 "osd_ops": [
 "delete"
 ]
 }
 ],
 "linger_ops": [
 {
 "linger_id": 1,
 "pg": "5.5f3bd635",
 "osd": 17,
 "object_id": "rbd_header.e10ca56b8b4567",
 "object_locator": "@5",
 "target_object_id": "rbd_header.e10ca56b8b4567",
 "target_object_locator": "@5",
 "paused": 0,
 "used_replica": 0,
 "precalc_pgid": 0,
 "snapid": "head",
 "registered": "1"
 }
 ],
 "pool_ops": [],
 "pool_stat_ops": [],
 "statfs_ops": [],
 "command_ops": []
 }

 Yes they've an established TCP connection. Qemu <=> osd.46. Attached is
 a pcap file of the traffic between them when it got stuck.

 Greets,
 Stefan

 Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
> On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
>  wrote:
>> We've enabled the op tracker for performance reasons while using SSD
>> only storage ;-(
>
> Disabled you mean?
>
>> Can enable the op tracker using ceph osd tell? Than reproduce the
>> problem. Check what has stucked again? Or should i generate an rbd log
>> from the client?
>
> From a super-quick glance at the code, it looks like that isn't a
> dynamic setting. Of course, it's possible that if you restart OSD 46
> to enable the op tracker, the stuck op will clear itself and the VM
> will resume. You could attempt to generate a gcore of OSD 46 to see if
> information on that op could be extracted via the debugger, but no
> guarantees.
>
> You might want to verify that the stuck client and OSD 46 have an
> actual established TCP connection as well before doing any further
> actions.
>
>>>
>>>
>>>
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-17 Thread Jason Dillaman
Perfect librbd log capture. I can see that a remove request to object
rbd_data.e10ca56b8b4567.311c was issued but it never
completed. This results in a hung discard and flush request.

Assuming that object is still mapped to OSD 46, I think there is
either something happening with that OSD or its backing disk that is
preventing the remove for that object from completing.
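
That mapping can be double-checked with something like the following (the
pool name cephstor1 is taken from elsewhere in this thread):

ceph osd map cephstor1 rbd_data.e10ca56b8b4567.311c
# should report osd.46 as the primary, e.g. "up ([46,...], p46) acting ([46,...], p46)"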

On Wed, May 17, 2017 at 6:04 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> attached is a rbd debug 20 log from the situation when the deadlock
> occours. I hope this helps.
>
> The VM was running until 2017-05-17 12:10 but there was no I/O for 10 min.
>
> Greets,
> Stefan
>
> Am 16.05.2017 um 22:54 schrieb Jason Dillaman:
>> It looks like it's just a ping message in that capture.
>>
>> Are you saying that you restarted OSD 46 and the problem persisted?
>>
>> On Tue, May 16, 2017 at 4:02 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello,
>>>
>>> while reproducing the problem, objecter_requests looks like this:
>>>
>>> {
>>> "ops": [
>>> {
>>> "tid": 42029,
>>> "pg": "5.bd9616ad",
>>> "osd": 46,
>>> "object_id": "rbd_data.e10ca56b8b4567.311c",
>>> "object_locator": "@5",
>>> "target_object_id": "rbd_data.e10ca56b8b4567.311c",
>>> "target_object_locator": "@5",
>>> "paused": 0,
>>> "used_replica": 0,
>>> "precalc_pgid": 0,
>>> "last_sent": "2.28854e+06s",
>>> "attempts": 1,
>>> "snapid": "head",
>>> "snap_context": "a07c2=[]",
>>> "mtime": "2017-05-16 21:53:22.0.069541s",
>>> "osd_ops": [
>>> "delete"
>>> ]
>>> }
>>> ],
>>> "linger_ops": [
>>> {
>>> "linger_id": 1,
>>> "pg": "5.5f3bd635",
>>> "osd": 17,
>>> "object_id": "rbd_header.e10ca56b8b4567",
>>> "object_locator": "@5",
>>> "target_object_id": "rbd_header.e10ca56b8b4567",
>>> "target_object_locator": "@5",
>>> "paused": 0,
>>> "used_replica": 0,
>>> "precalc_pgid": 0,
>>> "snapid": "head",
>>> "registered": "1"
>>> }
>>> ],
>>> "pool_ops": [],
>>> "pool_stat_ops": [],
>>> "statfs_ops": [],
>>> "command_ops": []
>>> }
>>>
>>> Yes they've an established TCP connection. Qemu <=> osd.46. Attached is
>>> a pcap file of the traffic between them when it got stuck.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
 On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
  wrote:
> We've enabled the op tracker for performance reasons while using SSD
> only storage ;-(

 Disabled you mean?

> Can enable the op tracker using ceph osd tell? Than reproduce the
> problem. Check what has stucked again? Or should i generate an rbd log
> from the client?

 From a super-quick glance at the code, it looks like that isn't a
 dynamic setting. Of course, it's possible that if you restart OSD 46
 to enable the op tracker, the stuck op will clear itself and the VM
 will resume. You could attempt to generate a gcore of OSD 46 to see if
 information on that op could be extracted via the debugger, but no
 guarantees.

 You might want to verify that the stuck client and OSD 46 have an
 actual established TCP connection as well before doing any further
 actions.

>>
>>
>>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Jason Dillaman
It looks like it's just a ping message in that capture.

Are you saying that you restarted OSD 46 and the problem persisted?

On Tue, May 16, 2017 at 4:02 PM, Stefan Priebe - Profihost AG
 wrote:
> Hello,
>
> while reproducing the problem, objecter_requests looks like this:
>
> {
> "ops": [
> {
> "tid": 42029,
> "pg": "5.bd9616ad",
> "osd": 46,
> "object_id": "rbd_data.e10ca56b8b4567.311c",
> "object_locator": "@5",
> "target_object_id": "rbd_data.e10ca56b8b4567.311c",
> "target_object_locator": "@5",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "last_sent": "2.28854e+06s",
> "attempts": 1,
> "snapid": "head",
> "snap_context": "a07c2=[]",
> "mtime": "2017-05-16 21:53:22.0.069541s",
> "osd_ops": [
> "delete"
> ]
> }
> ],
> "linger_ops": [
> {
> "linger_id": 1,
> "pg": "5.5f3bd635",
> "osd": 17,
> "object_id": "rbd_header.e10ca56b8b4567",
> "object_locator": "@5",
> "target_object_id": "rbd_header.e10ca56b8b4567",
> "target_object_locator": "@5",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "snapid": "head",
> "registered": "1"
> }
> ],
> "pool_ops": [],
> "pool_stat_ops": [],
> "statfs_ops": [],
> "command_ops": []
> }
>
> Yes they've an established TCP connection. Qemu <=> osd.46. Attached is
> a pcap file of the traffic between them when it got stuck.
>
> Greets,
> Stefan
>
> Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
>> On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> We've enabled the op tracker for performance reasons while using SSD
>>> only storage ;-(
>>
>> Disabled you mean?
>>
>>> Can enable the op tracker using ceph osd tell? Than reproduce the
>>> problem. Check what has stucked again? Or should i generate an rbd log
>>> from the client?
>>
>> From a super-quick glance at the code, it looks like that isn't a
>> dynamic setting. Of course, it's possible that if you restart OSD 46
>> to enable the op tracker, the stuck op will clear itself and the VM
>> will resume. You could attempt to generate a gcore of OSD 46 to see if
>> information on that op could be extracted via the debugger, but no
>> guarantees.
>>
>> You might want to verify that the stuck client and OSD 46 have an
>> actual established TCP connection as well before doing any further
>> actions.
>>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Stefan Priebe - Profihost AG
Hello,

while reproducing the problem, objecter_requests looks like this:

{
    "ops": [
        {
            "tid": 42029,
            "pg": "5.bd9616ad",
            "osd": 46,
            "object_id": "rbd_data.e10ca56b8b4567.311c",
            "object_locator": "@5",
            "target_object_id": "rbd_data.e10ca56b8b4567.311c",
            "target_object_locator": "@5",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "last_sent": "2.28854e+06s",
            "attempts": 1,
            "snapid": "head",
            "snap_context": "a07c2=[]",
            "mtime": "2017-05-16 21:53:22.0.069541s",
            "osd_ops": [
                "delete"
            ]
        }
    ],
    "linger_ops": [
        {
            "linger_id": 1,
            "pg": "5.5f3bd635",
            "osd": 17,
            "object_id": "rbd_header.e10ca56b8b4567",
            "object_locator": "@5",
            "target_object_id": "rbd_header.e10ca56b8b4567",
            "target_object_locator": "@5",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "snapid": "head",
            "registered": "1"
        }
    ],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": [],
    "command_ops": []
}

Yes, they have an established TCP connection (qemu <=> osd.46). Attached is
a pcap file of the traffic between them when it got stuck.
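
Roughly how such a capture can be taken on the hypervisor (the address and
port are placeholders for osd.46's public ip:port, e.g. as reported by
"ceph osd find 46"):

tcpdump -i any -s 0 -w /tmp/osd.46_qemu.pcap host <osd-46-ip> and port <osd-46-port>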

Greets,
Stefan

Am 16.05.2017 um 21:45 schrieb Jason Dillaman:
> On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
>  wrote:
>> We've enabled the op tracker for performance reasons while using SSD
>> only storage ;-(
> 
> Disabled you mean?
> 
>> Can enable the op tracker using ceph osd tell? Than reproduce the
>> problem. Check what has stucked again? Or should i generate an rbd log
>> from the client?
> 
> From a super-quick glance at the code, it looks like that isn't a
> dynamic setting. Of course, it's possible that if you restart OSD 46
> to enable the op tracker, the stuck op will clear itself and the VM
> will resume. You could attempt to generate a gcore of OSD 46 to see if
> information on that op could be extracted via the debugger, but no
> guarantees.
> 
> You might want to verify that the stuck client and OSD 46 have an
> actual established TCP connection as well before doing any further
> actions.
> 


osd.46_qemu_2.pcap.gz
Description: GNU Zip compressed data
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Jason Dillaman
On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
 wrote:
> We've enabled the op tracker for performance reasons while using SSD
> only storage ;-(

Disabled you mean?

> Can enable the op tracker using ceph osd tell? Than reproduce the
> problem. Check what has stucked again? Or should i generate an rbd log
> from the client?

From a super-quick glance at the code, it looks like that isn't a
dynamic setting. Of course, it's possible that if you restart OSD 46
to enable the op tracker, the stuck op will clear itself and the VM
will resume. You could attempt to generate a gcore of OSD 46 to see if
information on that op could be extracted via the debugger, but no
guarantees.

You might want to verify that the stuck client and OSD 46 have an
actual established TCP connection as well before doing any further
actions.
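
One way to do that check, as a sketch run on the hypervisor (the grep pattern
is only a placeholder for the address that "ceph osd find" reports):

ceph osd find 46                # prints the OSD's ip:port and crush location
ss -tnp | grep '<osd-46-ip>'    # look for an ESTABLISHED socket owned by the qemu process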

-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Stefan Priebe - Profihost AG
Hello Jason,

Am 16.05.2017 um 21:32 schrieb Jason Dillaman:
> Thanks for the update. In the ops dump provided, the objecter is
> saying that OSD 46 hasn't responded to the deletion request of object
> rbd_data.e10ca56b8b4567.311c.
> 
> Perhaps run "ceph daemon osd.46 dump_ops_in_flight" or "...
> dump_historic_ops" to see if that op is in the list?

We've enabled the op tracker for performance reasons while using SSD
only storage ;-(

Can I enable the op tracker using ceph osd tell? Then reproduce the
problem and check what got stuck? Or should I generate an rbd log
from the client?

> You can also run
> "ceph osd map  rbd_data.e10ca56b8b4567.311c" to
> verify that OSD 46 is the primary PG for that object.

Yes it is:
osdmap e886758 pool 'cephstor1' (5) object
'rbd_data.e10ca56b8b4567.311c' -> pg 5.bd9616ad (5.6ad) ->
up ([46,29,30], p46) acting ([46,29,30], p46)
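
Once the op tracker is enabled again, a quick way to look for the stuck
delete would be something like:

ceph daemon osd.46 dump_ops_in_flight | grep -B2 -A8 e10ca56b8b4567
ceph daemon osd.46 dump_historic_ops  | grep -B2 -A8 e10ca56b8b4567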

Greets,
Stefan

> On Tue, May 16, 2017 at 3:14 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> i'm happy to tell you that i've currently one VM where i can reproduce
>> the problem.
>>
>>> The best option would be to run "gcore" against the running VM whose
>>> IO is stuck, compress the dump, and use the "ceph-post-file" to
>>> provide the dump. I could then look at all the Ceph data structures to
>>> hopefully find the issue.
>>
>> I've saved the dump but it will contain sensitive informations. I won't
>> upload it to a public server. I'll send you an private email with a
>> private server to download the core dump. Thanks!
>>
>>> Enabling debug logs after the IO has stuck will most likely be of
>>> little value since it won't include the details of which IOs are
>>> outstanding. You could attempt to use "ceph --admin-daemon
>>> /path/to/stuck/vm/asok objecter_requests" to see if any IOs are just
>>> stuck waiting on an OSD to respond.
>>
>> This is the output:
>> # ceph --admin-daemon
>> /var/run/ceph/ceph-client.admin.5295.140214539927552.asok objecter_requests
>> {
>> "ops": [
>> {
>> "tid": 384632,
>> "pg": "5.bd9616ad",
>> "osd": 46,
>> "object_id": "rbd_data.e10ca56b8b4567.311c",
>> "object_locator": "@5",
>> "target_object_id": "rbd_data.e10ca56b8b4567.311c",
>> "target_object_locator": "@5",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "last_sent": "2.28554e+06s",
>> "attempts": 1,
>> "snapid": "head",
>> "snap_context": "a07c2=[]",
>> "mtime": "2017-05-16 21:03:22.0.196102s",
>> "osd_ops": [
>> "delete"
>> ]
>> }
>> ],
>> "linger_ops": [
>> {
>> "linger_id": 1,
>> "pg": "5.5f3bd635",
>> "osd": 17,
>> "object_id": "rbd_header.e10ca56b8b4567",
>> "object_locator": "@5",
>> "target_object_id": "rbd_header.e10ca56b8b4567",
>> "target_object_locator": "@5",
>> "paused": 0,
>> "used_replica": 0,
>> "precalc_pgid": 0,
>> "snapid": "head",
>> "registered": "1"
>> }
>> ],
>> "pool_ops": [],
>> "pool_stat_ops": [],
>> "statfs_ops": [],
>> "command_ops": []
>> }
>>
>> Greets,
>> Stefan
>>
>> Am 16.05.2017 um 15:44 schrieb Jason Dillaman:
>>> On Tue, May 16, 2017 at 2:12 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 3.) it still happens on pre jewel images even when they got restarted /
 killed and reinitialized. In that case they've the asok socket available
 for now. Should i issue any command to the socket to get log out of the
 hanging vm? Qemu is still responding just ceph / disk i/O gets stalled.
>>>
>>> The best option would be to run "gcore" against the running VM whose
>>> IO is stuck, compress the dump, and use the "ceph-post-file" to
>>> provide the dump. I could then look at all the Ceph data structures to
>>> hopefully find the issue.
>>>
>>> Enabling debug logs after the IO has stuck will most likely be of
>>> little value since it won't include the details of which IOs are
>>> outstanding. You could attempt to use "ceph --admin-daemon
>>> /path/to/stuck/vm/asok objecter_requests" to see if any IOs are just
>>> stuck waiting on an OSD to respond.
>>>
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Stefan Priebe - Profihost AG
Hello Jason,

I'm happy to tell you that I currently have one VM where I can
reproduce the problem.

> The best option would be to run "gcore" against the running VM whose
> IO is stuck, compress the dump, and use the "ceph-post-file" to
> provide the dump. I could then look at all the Ceph data structures to
> hopefully find the issue.

I've saved the dump, but it contains sensitive information. I won't
upload it to a public server. I'll send you a private email with a
private server to download the core dump. Thanks!

> Enabling debug logs after the IO has stuck will most likely be of
> little value since it won't include the details of which IOs are
> outstanding. You could attempt to use "ceph --admin-daemon
> /path/to/stuck/vm/asok objecter_requests" to see if any IOs are just
> stuck waiting on an OSD to respond.

This is the output:
# ceph --admin-daemon
/var/run/ceph/ceph-client.admin.5295.140214539927552.asok objecter_requests
{
    "ops": [
        {
            "tid": 384632,
            "pg": "5.bd9616ad",
            "osd": 46,
            "object_id": "rbd_data.e10ca56b8b4567.311c",
            "object_locator": "@5",
            "target_object_id": "rbd_data.e10ca56b8b4567.311c",
            "target_object_locator": "@5",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "last_sent": "2.28554e+06s",
            "attempts": 1,
            "snapid": "head",
            "snap_context": "a07c2=[]",
            "mtime": "2017-05-16 21:03:22.0.196102s",
            "osd_ops": [
                "delete"
            ]
        }
    ],
    "linger_ops": [
        {
            "linger_id": 1,
            "pg": "5.5f3bd635",
            "osd": 17,
            "object_id": "rbd_header.e10ca56b8b4567",
            "object_locator": "@5",
            "target_object_id": "rbd_header.e10ca56b8b4567",
            "target_object_locator": "@5",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "snapid": "head",
            "registered": "1"
        }
    ],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": [],
    "command_ops": []
}

Greets,
Stefan

Am 16.05.2017 um 15:44 schrieb Jason Dillaman:
> On Tue, May 16, 2017 at 2:12 AM, Stefan Priebe - Profihost AG
>  wrote:
>> 3.) it still happens on pre jewel images even when they got restarted /
>> killed and reinitialized. In that case they've the asok socket available
>> for now. Should i issue any command to the socket to get log out of the
>> hanging vm? Qemu is still responding just ceph / disk i/O gets stalled.
> 
> The best option would be to run "gcore" against the running VM whose
> IO is stuck, compress the dump, and use the "ceph-post-file" to
> provide the dump. I could then look at all the Ceph data structures to
> hopefully find the issue.
> 
> Enabling debug logs after the IO has stuck will most likely be of
> little value since it won't include the details of which IOs are
> outstanding. You could attempt to use "ceph --admin-daemon
> /path/to/stuck/vm/asok objecter_requests" to see if any IOs are just
> stuck waiting on an OSD to respond.
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Jason Dillaman
On Tue, May 16, 2017 at 2:12 AM, Stefan Priebe - Profihost AG
 wrote:
> 3.) it still happens on pre jewel images even when they got restarted /
> killed and reinitialized. In that case they've the asok socket available
> for now. Should i issue any command to the socket to get log out of the
> hanging vm? Qemu is still responding just ceph / disk i/O gets stalled.

The best option would be to run "gcore" against the running VM whose
IO is stuck, compress the dump, and use the "ceph-post-file" to
provide the dump. I could then look at all the Ceph data structures to
hopefully find the issue.

Enabling debug logs after the IO has stuck will most likely be of
little value since it won't include the details of which IOs are
outstanding. You could attempt to use "ceph --admin-daemon
/path/to/stuck/vm/asok objecter_requests" to see if any IOs are just
stuck waiting on an OSD to respond.
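
For VMs that do not expose an admin socket yet, a sketch of the client-side
ceph.conf bit that makes one available to qemu/librbd (paths are examples,
the qemu user needs write access to them, and the VM has to be restarted or
live-migrated to pick it up):

[client]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    log file = /var/log/ceph/qemu-client.$pid.log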

-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Stefan Priebe - Profihost AG

> 3.) it still happens on pre jewel images even when they got restarted
> / killed and reinitialized. In that case they've the asok socket
> available
> for now. Should i issue any command to the socket to get log out of
> the  hanging vm? Qemu is still responding just ceph / disk i/O gets
> stalled.

By that I meant restarting qemu.

Stefan


Am 16.05.2017 um 08:12 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
> 
> it got some further hints. Please see below.
> 
> Am 15.05.2017 um 22:25 schrieb Jason Dillaman:
>> On Mon, May 15, 2017 at 3:54 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Would it be possible that the problem is the same you fixed?
>>
>> No, I would not expect it to be related to the other issues you are
>> seeing. The issue I just posted a fix against only occurs when a
>> client requests the lock from the current owner, which will only occur
>> under the following scenarios: (1) attempt to write to the image
>> locked by another client, (2) attempt to disable image features on an
>> image locked by another client, (3) demote a primary mirrored image
>> when locked by another client, or (4) the rbd CLI attempted to perform
>> an operation not supported by the currently running lock owner client
>> due to version mismatch.
> 
> ah OK. Mhm nothing i would expect.
> 
>> I am assuming you are not running two VMs concurrently using the same
>> backing RBD image, so that would eliminate possibility (1).
> 
> No i do not.
> 
> I investigated a lot of time in analyzing the log files. What i can tell
> so far is:
> 
> 1.) it happens very often, when we issue a fstrim command on the root
> device of a vm. We're using Qemu virtio-scsi backend with:
> 
> cache=writeback,aio=threads,detect-zeroes=unmap,discard=on
> 
> 2.) but it also happens on other unknown "operations" - at least fstrim
> seems to trigger it at best
> 
> 3.) it happens once or twice a night while doing around 1500-2000
> backups. So it looks like a race to me.
> 
> 3.) it still happens on pre jewel images even when they got restarted /
> killed and reinitialized. In that case they've the asok socket available
> for now. Should i issue any command to the socket to get log out of the
> hanging vm? Qemu is still responding just ceph / disk i/O gets stalled.
> 
> Greets,
> Stefan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-16 Thread Stefan Priebe - Profihost AG
Hello Jason,

it got some further hints. Please see below.

Am 15.05.2017 um 22:25 schrieb Jason Dillaman:
> On Mon, May 15, 2017 at 3:54 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Would it be possible that the problem is the same you fixed?
> 
> No, I would not expect it to be related to the other issues you are
> seeing. The issue I just posted a fix against only occurs when a
> client requests the lock from the current owner, which will only occur
> under the following scenarios: (1) attempt to write to the image
> locked by another client, (2) attempt to disable image features on an
> image locked by another client, (3) demote a primary mirrored image
> when locked by another client, or (4) the rbd CLI attempted to perform
> an operation not supported by the currently running lock owner client
> due to version mismatch.

Ah OK. Hmm, nothing I would expect.

> I am assuming you are not running two VMs concurrently using the same
> backing RBD image, so that would eliminate possibility (1).

No, I do not.

I invested a lot of time in analyzing the log files. What I can tell
so far is:

1.) It happens very often when we issue an fstrim command on the root
device of a VM. We're using the Qemu virtio-scsi backend with:

cache=writeback,aio=threads,detect-zeroes=unmap,discard=on

2.) But it also happens on other, unknown "operations" - fstrim just
seems to trigger it most reliably.

3.) It happens once or twice a night while doing around 1500-2000
backups. So it looks like a race to me.

4.) It still happens on pre-jewel images even when they got restarted /
killed and reinitialized. In that case they have the asok socket available
for now. Should I issue any command to the socket to get a log out of the
hanging VM? Qemu is still responding; just the ceph / disk I/O gets stalled.
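
A sketch of reproducing and watching it at the same time (the asok path
follows the client socket naming visible elsewhere in this thread):

# inside the guest: the trigger that most often reproduces the hang
fstrim -v /
# on the host: watch the client's objecter for a delete that never completes
watch -n 5 'ceph --admin-daemon /var/run/ceph/ceph-client.admin.<pid>.<cctid>.asok objecter_requests'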

Greets,
Stefan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-15 Thread Jason Dillaman
On Mon, May 15, 2017 at 3:54 PM, Stefan Priebe - Profihost AG
 wrote:
> Would it be possible that the problem is the same you fixed?

No, I would not expect it to be related to the other issues you are
seeing. The issue I just posted a fix against only occurs when a
client requests the lock from the current owner, which will only occur
under the following scenarios: (1) attempt to write to the image
locked by another client, (2) attempt to disable image features on an
image locked by another client, (3) demote a primary mirrored image
when locked by another client, or (4) the rbd CLI attempted to perform
an operation not supported by the currently running lock owner client
due to version mismatch.

I am assuming you are not running two VMs concurrently using the same
backing RBD image, so that would eliminate possibility (1).
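
If it helps to rule out (1), the current lock holder and watchers can be
inspected along these lines (pool/image names are placeholders):

rbd lock ls cephstor1/vm-disk-1     # shows the current lock holder, if any
rbd status cephstor1/vm-disk-1      # lists the watchers, i.e. clients that have the image open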


-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-15 Thread Stefan Priebe - Profihost AG
Hi,

Great, thanks.

I'm still trying, but it's difficult for me as well. As it happens only
sometimes, there must be an unknown additional factor. For the future
I've enabled client sockets for all VMs as well. But this does not help
in this case, as it seems to be fixed after migration.

Would it be possible that the problem is the same one you fixed? Could it
be that the export process holds the lock while the client tries to
reacquire it for writing?

Greets,
Stefan
Am 15.05.2017 um 21:19 schrieb Jason Dillaman:
> I was able to re-create the issue where "rbd feature disable" hangs if
> the client experienced a long comms failure with the OSDs, and I have
> a proposed fix posted [1]. Unfortunately, I haven't been successful in
> repeating any stalled IO, discard issues, nor export-diff logged
> errors. I'll keep trying to reproduce, but if you can generate
> debug-level logging from one of these events it would be greatly
> appreciated.
> 
> [1] https://github.com/ceph/ceph/pull/15093
> 
> On Mon, May 15, 2017 at 1:29 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>> Just so I can attempt to repeat this:
>>
>> Thanks.
>>
>>> (1) you had an image that was built using Hammer clients and OSDs with
>>> exclusive lock disabled
>> Yes. It was created with the hammer rbd defaults.
>>
>>> (2) you updated your clients and OSDs to Jewel
>>> (3) you restarted your OSDs and live-migrated your VMs to pick up the
>>> Jewel changes
>>
>> No. I updated the clients only and did a live migration for all VMs to
>> load up the jewel librbd.
>>
>> After that i updated the mons + restart and than updated the osds + restart.
>>
>>> (4) you enabled exclusive-lock, object-map, and fast-diff on a running VM
>> Yes.
>>
>>> (5) you rebuilt the image's object map (while the VM was running?)
>> Yes.
>>
>>> (6) things started breaking at this point
>> Yes but not on all VMs and only while creating and deleting snapshots.
>>
>> Greets,
>> Stefan
>>
>>
>>>
>>> On Sun, May 14, 2017 at 1:42 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 I verified it. After a live migration of the VM i'm able to successfully
 disable fast-diff,exclusive-lock,object-map.

 The problem only seems to occur at all if a client has connected to
 hammer without exclusive lock. Than got upgraded to jewel and exclusive
 lock gets enabled.

 Greets,
 Stefan

 Am 14.05.2017 um 19:33 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
>
> Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
>> It appears as though there is client.27994090 at 10.255.0.13 that
>> currently owns the exclusive lock on that image. I am assuming the log
>> is from "rbd feature disable"?
> Yes.
>
>> If so, I can see that it attempts to
>> acquire the lock and the other side is not appropriately responding to
>> the request.
>>
>> Assuming your system is still in this state, is there any chance to
>> get debug rbd=20 logs from that client by using the client's asok file
>> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
>> and re-run the attempt to disable exclusive lock?
>
> It's a VM running qemu with librbd. It seems there is no default socket.
> If there is no way to activate it later - i don't think so. I can try to
> activate it in ceph.conf and migrate it to another node. But i'm not
> sure whether the problem persist after migration or if librbd is
> somewhat like reinitialized.
>
>> Also, what version of Ceph is that client running?
> Client and Server are on ceph 10.2.7.
>
> Greets,
> Stefan
>
>> Jason
>>
>> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> as it still happens and VMs are crashing. I wanted to disable
>>> exclusive-lock,fast-diff again. But i detected that there are images
>>> where the rbd commands runs in an endless loop.
>>>
>>> I canceled the command after 60s and used --debug-rbd=20. Will send the
>>> log off list.
>>>
>>> Thanks!
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
 Hello Jason,

 it seems to be related to fstrim and discard. I cannot reproduce it for
 images were we don't use trim - but it's still the case it's working
 fine for images created with jewel and it is not for images pre jewel.
 The only difference i can find is that the images created with jewel
 also support deep-flatten.

 Greets,
 Stefan

 Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
> Assuming the only log messages you are seeing are the following:
>
> 2017-05-06 03:20:50.830626 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-15 Thread Jason Dillaman
I was able to re-create the issue where "rbd feature disable" hangs if
the client experienced a long comms failure with the OSDs, and I have
a proposed fix posted [1]. Unfortunately, I haven't been successful in
repeating any stalled IO, discard issues, or export-diff logged
errors. I'll keep trying to reproduce, but if you can generate
debug-level logging from one of these events it would be greatly
appreciated.

[1] https://github.com/ceph/ceph/pull/15093

On Mon, May 15, 2017 at 1:29 PM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>> Just so I can attempt to repeat this:
>
> Thanks.
>
>> (1) you had an image that was built using Hammer clients and OSDs with
>> exclusive lock disabled
> Yes. It was created with the hammer rbd defaults.
>
>> (2) you updated your clients and OSDs to Jewel
>> (3) you restarted your OSDs and live-migrated your VMs to pick up the
>> Jewel changes
>
> No. I updated the clients only and did a live migration for all VMs to
> load up the jewel librbd.
>
> After that i updated the mons + restart and than updated the osds + restart.
>
>> (4) you enabled exclusive-lock, object-map, and fast-diff on a running VM
> Yes.
>
>> (5) you rebuilt the image's object map (while the VM was running?)
> Yes.
>
>> (6) things started breaking at this point
> Yes but not on all VMs and only while creating and deleting snapshots.
>
> Greets,
> Stefan
>
>
>>
>> On Sun, May 14, 2017 at 1:42 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> I verified it. After a live migration of the VM i'm able to successfully
>>> disable fast-diff,exclusive-lock,object-map.
>>>
>>> The problem only seems to occur at all if a client has connected to
>>> hammer without exclusive lock. Than got upgraded to jewel and exclusive
>>> lock gets enabled.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 14.05.2017 um 19:33 schrieb Stefan Priebe - Profihost AG:
 Hello Jason,

 Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
> It appears as though there is client.27994090 at 10.255.0.13 that
> currently owns the exclusive lock on that image. I am assuming the log
> is from "rbd feature disable"?
 Yes.

> If so, I can see that it attempts to
> acquire the lock and the other side is not appropriately responding to
> the request.
>
> Assuming your system is still in this state, is there any chance to
> get debug rbd=20 logs from that client by using the client's asok file
> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
> and re-run the attempt to disable exclusive lock?

 It's a VM running qemu with librbd. It seems there is no default socket.
 If there is no way to activate it later - i don't think so. I can try to
 activate it in ceph.conf and migrate it to another node. But i'm not
 sure whether the problem persist after migration or if librbd is
 somewhat like reinitialized.

> Also, what version of Ceph is that client running?
 Client and Server are on ceph 10.2.7.

 Greets,
 Stefan

> Jason
>
> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> as it still happens and VMs are crashing. I wanted to disable
>> exclusive-lock,fast-diff again. But i detected that there are images
>> where the rbd commands runs in an endless loop.
>>
>> I canceled the command after 60s and used --debug-rbd=20. Will send the
>> log off list.
>>
>> Thanks!
>>
>> Greets,
>> Stefan
>>
>> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
>>> Hello Jason,
>>>
>>> it seems to be related to fstrim and discard. I cannot reproduce it for
>>> images were we don't use trim - but it's still the case it's working
>>> fine for images created with jewel and it is not for images pre jewel.
>>> The only difference i can find is that the images created with jewel
>>> also support deep-flatten.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
 Assuming the only log messages you are seeing are the following:

 2017-05-06 03:20:50.830626 7f7876a64700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
 object map in-memory
 2017-05-06 03:20:50.830634 7f7876a64700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
 object map on-disk
 2017-05-06 03:20:50.831250 7f7877265700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
 r=0

 It looks like that can only occur if somehow the object-map on disk is
 larger than the actual image size. If that's the case, how the image
 got into that state is unknown to me at this point.

 On Thu, May 11, 2017 at 3:23 PM, Stefan 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-15 Thread Stefan Priebe - Profihost AG
Hello Jason,
> Just so I can attempt to repeat this:

Thanks.

> (1) you had an image that was built using Hammer clients and OSDs with
> exclusive lock disabled
Yes. It was created with the hammer rbd defaults.

> (2) you updated your clients and OSDs to Jewel
> (3) you restarted your OSDs and live-migrated your VMs to pick up the
> Jewel changes

No. I updated the clients only and did a live migration of all VMs to
load the jewel librbd.

After that I updated the mons + restart and then updated the osds + restart.

> (4) you enabled exclusive-lock, object-map, and fast-diff on a running VM
Yes.

> (5) you rebuilt the image's object map (while the VM was running?)
Yes.

> (6) things started breaking at this point
Yes, but not on all VMs, and only while creating and deleting snapshots.

Greets,
Stefan


> 
> On Sun, May 14, 2017 at 1:42 PM, Stefan Priebe - Profihost AG
>  wrote:
>> I verified it. After a live migration of the VM i'm able to successfully
>> disable fast-diff,exclusive-lock,object-map.
>>
>> The problem only seems to occur at all if a client has connected to
>> hammer without exclusive lock. Than got upgraded to jewel and exclusive
>> lock gets enabled.
>>
>> Greets,
>> Stefan
>>
>> Am 14.05.2017 um 19:33 schrieb Stefan Priebe - Profihost AG:
>>> Hello Jason,
>>>
>>> Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
 It appears as though there is client.27994090 at 10.255.0.13 that
 currently owns the exclusive lock on that image. I am assuming the log
 is from "rbd feature disable"?
>>> Yes.
>>>
 If so, I can see that it attempts to
 acquire the lock and the other side is not appropriately responding to
 the request.

 Assuming your system is still in this state, is there any chance to
 get debug rbd=20 logs from that client by using the client's asok file
 and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
 and re-run the attempt to disable exclusive lock?
>>>
>>> It's a VM running qemu with librbd. It seems there is no default socket.
>>> If there is no way to activate it later - i don't think so. I can try to
>>> activate it in ceph.conf and migrate it to another node. But i'm not
>>> sure whether the problem persist after migration or if librbd is
>>> somewhat like reinitialized.
>>>
 Also, what version of Ceph is that client running?
>>> Client and Server are on ceph 10.2.7.
>>>
>>> Greets,
>>> Stefan
>>>
 Jason

 On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
  wrote:
> Hello Jason,
>
> as it still happens and VMs are crashing. I wanted to disable
> exclusive-lock,fast-diff again. But i detected that there are images
> where the rbd commands runs in an endless loop.
>
> I canceled the command after 60s and used --debug-rbd=20. Will send the
> log off list.
>
> Thanks!
>
> Greets,
> Stefan
>
> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> it seems to be related to fstrim and discard. I cannot reproduce it for
>> images were we don't use trim - but it's still the case it's working
>> fine for images created with jewel and it is not for images pre jewel.
>> The only difference i can find is that the images created with jewel
>> also support deep-flatten.
>>
>> Greets,
>> Stefan
>>
>> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
>>> Assuming the only log messages you are seeing are the following:
>>>
>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map in-memory
>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map on-disk
>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
>>> r=0
>>>
>>> It looks like that can only occur if somehow the object-map on disk is
>>> larger than the actual image size. If that's the case, how the image
>>> got into that state is unknown to me at this point.
>>>
>>> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hi Jason,

 it seems i can at least circumvent the crashes. Since i restarted ALL
 osds after enabling exclusive lock and rebuilding the object maps it 
 had
 no new crashes.

 What still makes me wonder are those
 librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
 r=0

 messages.

 Greets,
 Stefan

 Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
> Hi,
> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>> You are saying that you 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-15 Thread Jason Dillaman
Just so I can attempt to repeat this:

(1) you had an image that was built using Hammer clients and OSDs with
exclusive lock disabled
(2) you updated your clients and OSDs to Jewel
(3) you restarted your OSDs and live-migrated your VMs to pick up the
Jewel changes
(4) you enabled exclusive-lock, object-map, and fast-diff on a running VM
(5) you rebuilt the image's object map (while the VM was running?)
(6) things started breaking at this point

Thanks.


On Sun, May 14, 2017 at 1:42 PM, Stefan Priebe - Profihost AG
 wrote:
> I verified it. After a live migration of the VM i'm able to successfully
> disable fast-diff,exclusive-lock,object-map.
>
> The problem only seems to occur at all if a client has connected to
> hammer without exclusive lock. Than got upgraded to jewel and exclusive
> lock gets enabled.
>
> Greets,
> Stefan
>
> Am 14.05.2017 um 19:33 schrieb Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
>>> It appears as though there is client.27994090 at 10.255.0.13 that
>>> currently owns the exclusive lock on that image. I am assuming the log
>>> is from "rbd feature disable"?
>> Yes.
>>
>>> If so, I can see that it attempts to
>>> acquire the lock and the other side is not appropriately responding to
>>> the request.
>>>
>>> Assuming your system is still in this state, is there any chance to
>>> get debug rbd=20 logs from that client by using the client's asok file
>>> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
>>> and re-run the attempt to disable exclusive lock?
>>
>> It's a VM running qemu with librbd. It seems there is no default socket.
>> If there is no way to activate it later - i don't think so. I can try to
>> activate it in ceph.conf and migrate it to another node. But i'm not
>> sure whether the problem persist after migration or if librbd is
>> somewhat like reinitialized.
>>
>>> Also, what version of Ceph is that client running?
>> Client and Server are on ceph 10.2.7.
>>
>> Greets,
>> Stefan
>>
>>> Jason
>>>
>>> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hello Jason,

 as it still happens and VMs are crashing. I wanted to disable
 exclusive-lock,fast-diff again. But i detected that there are images
 where the rbd commands runs in an endless loop.

 I canceled the command after 60s and used --debug-rbd=20. Will send the
 log off list.

 Thanks!

 Greets,
 Stefan

 Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
>
> it seems to be related to fstrim and discard. I cannot reproduce it for
> images were we don't use trim - but it's still the case it's working
> fine for images created with jewel and it is not for images pre jewel.
> The only difference i can find is that the images created with jewel
> also support deep-flatten.
>
> Greets,
> Stefan
>
> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
>> Assuming the only log messages you are seeing are the following:
>>
>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map in-memory
>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map on-disk
>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
>> r=0
>>
>> It looks like that can only occur if somehow the object-map on disk is
>> larger than the actual image size. If that's the case, how the image
>> got into that state is unknown to me at this point.
>>
>> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hi Jason,
>>>
>>> it seems i can at least circumvent the crashes. Since i restarted ALL
>>> osds after enabling exclusive lock and rebuilding the object maps it had
>>> no new crashes.
>>>
>>> What still makes me wonder are those
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
>>> r=0
>>>
>>> messages.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
 Hi,
 Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
> You are saying that you had v2 RBD images created against Hammer OSDs
> and client libraries where exclusive lock, object map, etc were never
> enabled. You then upgraded the OSDs and clients to Jewel and at some
> point enabled exclusive lock (and I'd assume object map) on these
> images

 Yes i did:
 for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 
 }');
 do rbd -p cephstor5 feature enable $img
 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-14 Thread Stefan Priebe - Profihost AG
I verified it. After a live migration of the VM I'm able to successfully
disable fast-diff, exclusive-lock and object-map.

The problem only seems to occur at all if a client first connected under
hammer without exclusive lock, then got upgraded to jewel, and exclusive
lock was enabled afterwards.

Greets,
Stefan

Am 14.05.2017 um 19:33 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
> 
> Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
>> It appears as though there is client.27994090 at 10.255.0.13 that
>> currently owns the exclusive lock on that image. I am assuming the log
>> is from "rbd feature disable"?
> Yes.
> 
>> If so, I can see that it attempts to
>> acquire the lock and the other side is not appropriately responding to
>> the request.
>>
>> Assuming your system is still in this state, is there any chance to
>> get debug rbd=20 logs from that client by using the client's asok file
>> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
>> and re-run the attempt to disable exclusive lock?
> 
> It's a VM running qemu with librbd. It seems there is no default socket.
> If there is no way to activate it later - i don't think so. I can try to
> activate it in ceph.conf and migrate it to another node. But i'm not
> sure whether the problem persist after migration or if librbd is
> somewhat like reinitialized.
> 
>> Also, what version of Ceph is that client running?
> Client and Server are on ceph 10.2.7.
> 
> Greets,
> Stefan
> 
>> Jason
>>
>> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hello Jason,
>>>
>>> as it still happens and VMs are crashing. I wanted to disable
>>> exclusive-lock,fast-diff again. But i detected that there are images
>>> where the rbd commands runs in an endless loop.
>>>
>>> I canceled the command after 60s and used --debug-rbd=20. Will send the
>>> log off list.
>>>
>>> Thanks!
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
 Hello Jason,

 it seems to be related to fstrim and discard. I cannot reproduce it for
 images were we don't use trim - but it's still the case it's working
 fine for images created with jewel and it is not for images pre jewel.
 The only difference i can find is that the images created with jewel
 also support deep-flatten.

 Greets,
 Stefan

 Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
> Assuming the only log messages you are seeing are the following:
>
> 2017-05-06 03:20:50.830626 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map in-memory
> 2017-05-06 03:20:50.830634 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map on-disk
> 2017-05-06 03:20:50.831250 7f7877265700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>
> It looks like that can only occur if somehow the object-map on disk is
> larger than the actual image size. If that's the case, how the image
> got into that state is unknown to me at this point.
>
> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hi Jason,
>>
>> it seems i can at least circumvent the crashes. Since i restarted ALL
>> osds after enabling exclusive lock and rebuilding the object maps it had
>> no new crashes.
>>
>> What still makes me wonder are those
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
>> r=0
>>
>> messages.
>>
>> Greets,
>> Stefan
>>
>> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
>>> Hi,
>>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
 You are saying that you had v2 RBD images created against Hammer OSDs
 and client libraries where exclusive lock, object map, etc were never
 enabled. You then upgraded the OSDs and clients to Jewel and at some
 point enabled exclusive lock (and I'd assume object map) on these
 images
>>>
>>> Yes i did:
>>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
>>> do rbd -p cephstor5 feature enable $img
>>> exclusive-lock,object-map,fast-diff || echo $img; done
>>>
 -- or were the exclusive lock and object map features already
 enabled under Hammer?
>>>
>>> No as they were not the rbd defaults.
>>>
 The fact that you encountered an object map error on an export
 operation is surprising to me.  Does that error re-occur if you
 perform the export again? If you can repeat it, it would be very
 helpful if you could run the export with "--debug-rbd=20" and capture
 the generated logs.
>>>
>>> No i can't repeat it. It happens every night but for different images.
>>> But i never saw it for a vm twice. If i do he 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-14 Thread Stefan Priebe - Profihost AG
Hello Jason,

Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
> It appears as though there is client.27994090 at 10.255.0.13 that
> currently owns the exclusive lock on that image. I am assuming the log
> is from "rbd feature disable"?
Yes.

> If so, I can see that it attempts to
> acquire the lock and the other side is not appropriately responding to
> the request.
> 
> Assuming your system is still in this state, is there any chance to
> get debug rbd=20 logs from that client by using the client's asok file
> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
> and re-run the attempt to disable exclusive lock?

It's a VM running QEMU with librbd. It seems there is no default socket,
and I don't think there is a way to activate it later. I can try to
activate it in ceph.conf and migrate the VM to another node, but I'm not
sure whether the problem persists after migration or whether librbd
effectively gets reinitialized.
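
What I could try on the migration target node is something like this in
ceph.conf (the paths are just examples), so that the newly started qemu
process creates an admin socket right away:

[client]
    # $pid/$cctid keep several VMs on one host from colliding
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    log file = /var/log/ceph/qemu-client.$pid.log
    debug rbd = 0/0

The socket could then be used later with the "config set debug_rbd 20"
command you mentioned.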

> Also, what version of Ceph is that client running?
Client and Server are on ceph 10.2.7.

Greets,
Stefan

> Jason
> 
> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Hello Jason,
>>
>> as it still happens and VMs are crashing. I wanted to disable
>> exclusive-lock,fast-diff again. But i detected that there are images
>> where the rbd commands runs in an endless loop.
>>
>> I canceled the command after 60s and used --debug-rbd=20. Will send the
>> log off list.
>>
>> Thanks!
>>
>> Greets,
>> Stefan
>>
>> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
>>> Hello Jason,
>>>
>>> it seems to be related to fstrim and discard. I cannot reproduce it for
>>> images were we don't use trim - but it's still the case it's working
>>> fine for images created with jewel and it is not for images pre jewel.
>>> The only difference i can find is that the images created with jewel
>>> also support deep-flatten.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
 Assuming the only log messages you are seeing are the following:

 2017-05-06 03:20:50.830626 7f7876a64700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
 object map in-memory
 2017-05-06 03:20:50.830634 7f7876a64700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
 object map on-disk
 2017-05-06 03:20:50.831250 7f7877265700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0

 It looks like that can only occur if somehow the object-map on disk is
 larger than the actual image size. If that's the case, how the image
 got into that state is unknown to me at this point.

 On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
  wrote:
> Hi Jason,
>
> it seems i can at least circumvent the crashes. Since i restarted ALL
> osds after enabling exclusive lock and rebuilding the object maps it had
> no new crashes.
>
> What still makes me wonder are those
> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>
> messages.
>
> Greets,
> Stefan
>
> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
>> Hi,
>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>>> You are saying that you had v2 RBD images created against Hammer OSDs
>>> and client libraries where exclusive lock, object map, etc were never
>>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>>> point enabled exclusive lock (and I'd assume object map) on these
>>> images
>>
>> Yes i did:
>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
>> do rbd -p cephstor5 feature enable $img
>> exclusive-lock,object-map,fast-diff || echo $img; done
>>
>>> -- or were the exclusive lock and object map features already
>>> enabled under Hammer?
>>
>> No as they were not the rbd defaults.
>>
>>> The fact that you encountered an object map error on an export
>>> operation is surprising to me.  Does that error re-occur if you
>>> perform the export again? If you can repeat it, it would be very
>>> helpful if you could run the export with "--debug-rbd=20" and capture
>>> the generated logs.
>>
>> No i can't repeat it. It happens every night but for different images.
>> But i never saw it for a vm twice. If i do he export again it works fine.
>>
>> I'm doing an rbd export or an rbd export-diff --from-snap it depends on
>> the VM and day since the last snapshot.
>>
>> Greets,
>> Stefan
>>
>>>
>>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hi,

 also i'm getting these errors only for pre jewel images:

 2017-05-06 03:20:50.830626 7f7876a64700 -1
 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-14 Thread Jason Dillaman
It appears as though there is client.27994090 at 10.255.0.13 that
currently owns the exclusive lock on that image. I am assuming the log
is from "rbd feature disable"? If so, I can see that it attempts to
acquire the lock and the other side is not appropriately responding to
the request.

Assuming your system is still in this state, is there any chance to
get debug rbd=20 logs from that client by using the client's asok file
and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
and re-run the attempt to disable exclusive lock? Also, what version
of Ceph is that client running?

Jason

On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello Jason,
>
> as it still happens and VMs are crashing. I wanted to disable
> exclusive-lock,fast-diff again. But i detected that there are images
> where the rbd commands runs in an endless loop.
>
> I canceled the command after 60s and used --debug-rbd=20. Will send the
> log off list.
>
> Thanks!
>
> Greets,
> Stefan
>
> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> it seems to be related to fstrim and discard. I cannot reproduce it for
>> images were we don't use trim - but it's still the case it's working
>> fine for images created with jewel and it is not for images pre jewel.
>> The only difference i can find is that the images created with jewel
>> also support deep-flatten.
>>
>> Greets,
>> Stefan
>>
>> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
>>> Assuming the only log messages you are seeing are the following:
>>>
>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map in-memory
>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map on-disk
>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>
>>> It looks like that can only occur if somehow the object-map on disk is
>>> larger than the actual image size. If that's the case, how the image
>>> got into that state is unknown to me at this point.
>>>
>>> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hi Jason,

 it seems i can at least circumvent the crashes. Since i restarted ALL
 osds after enabling exclusive lock and rebuilding the object maps it had
 no new crashes.

 What still makes me wonder are those
 librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0

 messages.

 Greets,
 Stefan

 Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
> Hi,
> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>> You are saying that you had v2 RBD images created against Hammer OSDs
>> and client libraries where exclusive lock, object map, etc were never
>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>> point enabled exclusive lock (and I'd assume object map) on these
>> images
>
> Yes i did:
> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
> do rbd -p cephstor5 feature enable $img
> exclusive-lock,object-map,fast-diff || echo $img; done
>
>> -- or were the exclusive lock and object map features already
>> enabled under Hammer?
>
> No as they were not the rbd defaults.
>
>> The fact that you encountered an object map error on an export
>> operation is surprising to me.  Does that error re-occur if you
>> perform the export again? If you can repeat it, it would be very
>> helpful if you could run the export with "--debug-rbd=20" and capture
>> the generated logs.
>
> No i can't repeat it. It happens every night but for different images.
> But i never saw it for a vm twice. If i do he export again it works fine.
>
> I'm doing an rbd export or an rbd export-diff --from-snap it depends on
> the VM and day since the last snapshot.
>
> Greets,
> Stefan
>
>>
>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hi,
>>>
>>> also i'm getting these errors only for pre jewel images:
>>>
>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map in-memory
>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map on-disk
>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
>>> r=0
>>>
>>> while running export-diff.
>>>
>>> Stefan
>>>
>>> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
 Hello Json,

 while 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-13 Thread Stefan Priebe - Profihost AG
Hello Jason,

as it still happens and VMs are crashing, I wanted to disable
exclusive-lock and fast-diff again. But I noticed that there are images
where the rbd command runs in an endless loop.

I canceled the command after 60s and used --debug-rbd=20. Will send the
log off-list.
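
For completeness, roughly the command that hangs and how the debug log
was captured (the image name is just a placeholder; fast-diff and
object-map depend on exclusive-lock, so they are listed first):

# the debug output lands on stderr, hence the 2> redirect
rbd feature disable cephstor5/vm-123-disk-1 \
    fast-diff object-map exclusive-lock --debug-rbd=20 2> rbd-disable.log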

Thanks!

Greets,
Stefan

Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
> Hello Jason,
> 
> it seems to be related to fstrim and discard. I cannot reproduce it for
> images were we don't use trim - but it's still the case it's working
> fine for images created with jewel and it is not for images pre jewel.
> The only difference i can find is that the images created with jewel
> also support deep-flatten.
> 
> Greets,
> Stefan
> 
> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
>> Assuming the only log messages you are seeing are the following:
>>
>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map in-memory
>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map on-disk
>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>
>> It looks like that can only occur if somehow the object-map on disk is
>> larger than the actual image size. If that's the case, how the image
>> got into that state is unknown to me at this point.
>>
>> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hi Jason,
>>>
>>> it seems i can at least circumvent the crashes. Since i restarted ALL
>>> osds after enabling exclusive lock and rebuilding the object maps it had
>>> no new crashes.
>>>
>>> What still makes me wonder are those
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>
>>> messages.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
 Hi,
 Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
> You are saying that you had v2 RBD images created against Hammer OSDs
> and client libraries where exclusive lock, object map, etc were never
> enabled. You then upgraded the OSDs and clients to Jewel and at some
> point enabled exclusive lock (and I'd assume object map) on these
> images

 Yes i did:
 for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
 do rbd -p cephstor5 feature enable $img
 exclusive-lock,object-map,fast-diff || echo $img; done

> -- or were the exclusive lock and object map features already
> enabled under Hammer?

 No as they were not the rbd defaults.

> The fact that you encountered an object map error on an export
> operation is surprising to me.  Does that error re-occur if you
> perform the export again? If you can repeat it, it would be very
> helpful if you could run the export with "--debug-rbd=20" and capture
> the generated logs.

 No i can't repeat it. It happens every night but for different images.
 But i never saw it for a vm twice. If i do he export again it works fine.

 I'm doing an rbd export or an rbd export-diff --from-snap it depends on
 the VM and day since the last snapshot.

 Greets,
 Stefan

>
> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hi,
>>
>> also i'm getting these errors only for pre jewel images:
>>
>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map in-memory
>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map on-disk
>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: 
>> r=0
>>
>> while running export-diff.
>>
>> Stefan
>>
>> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
>>> Hello Json,
>>>
>>> while doing further testing it happens only with images created with
>>> hammer and that got upgraded to jewel AND got enabled exclusive lock.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
 Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
 command and post the resulting log to a new ticket at [1]? I'd also be
 interested if you could re-create that
 "librbd::object_map::InvalidateRequest" issue repeatably.
 n
 [1] http://tracker.ceph.com/projects/rbd/issues

 On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
  wrote:
> Example:
> # rbd rm cephstor2/vm-136-disk-1
> Removing image: 99% complete...
>
> Stuck at 

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-13 Thread Stefan Priebe - Profihost AG
Hello Jason,

it seems to be related to fstrim and discard. I cannot reproduce it for
images where we don't use trim - but it's still the case that it works
fine for images created with jewel and not for images created pre-jewel.
The only difference I can find is that the images created with jewel
also support deep-flatten.
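
The comparison itself is nothing more than this (placeholder names again):

# image created before the upgrade, features enabled afterwards:
rbd -p cephstor5 info vm-123-disk-1 | grep -E 'features|flags'

# image created under jewel (shows deep-flatten as well):
rbd -p cephstor5 info vm-456-disk-1 | grep -E 'features|flags'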

Greets,
Stefan

Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
> Assuming the only log messages you are seeing are the following:
> 
> 2017-05-06 03:20:50.830626 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map in-memory
> 2017-05-06 03:20:50.830634 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map on-disk
> 2017-05-06 03:20:50.831250 7f7877265700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
> 
> It looks like that can only occur if somehow the object-map on disk is
> larger than the actual image size. If that's the case, how the image
> got into that state is unknown to me at this point.
> 
> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hi Jason,
>>
>> it seems i can at least circumvent the crashes. Since i restarted ALL
>> osds after enabling exclusive lock and rebuilding the object maps it had
>> no new crashes.
>>
>> What still makes me wonder are those
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>
>> messages.
>>
>> Greets,
>> Stefan
>>
>> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
>>> Hi,
>>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
 You are saying that you had v2 RBD images created against Hammer OSDs
 and client libraries where exclusive lock, object map, etc were never
 enabled. You then upgraded the OSDs and clients to Jewel and at some
 point enabled exclusive lock (and I'd assume object map) on these
 images
>>>
>>> Yes i did:
>>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
>>> do rbd -p cephstor5 feature enable $img
>>> exclusive-lock,object-map,fast-diff || echo $img; done
>>>
 -- or were the exclusive lock and object map features already
 enabled under Hammer?
>>>
>>> No as they were not the rbd defaults.
>>>
 The fact that you encountered an object map error on an export
 operation is surprising to me.  Does that error re-occur if you
 perform the export again? If you can repeat it, it would be very
 helpful if you could run the export with "--debug-rbd=20" and capture
 the generated logs.
>>>
>>> No i can't repeat it. It happens every night but for different images.
>>> But i never saw it for a vm twice. If i do he export again it works fine.
>>>
>>> I'm doing an rbd export or an rbd export-diff --from-snap it depends on
>>> the VM and day since the last snapshot.
>>>
>>> Greets,
>>> Stefan
>>>

 On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
  wrote:
> Hi,
>
> also i'm getting these errors only for pre jewel images:
>
> 2017-05-06 03:20:50.830626 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map in-memory
> 2017-05-06 03:20:50.830634 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map on-disk
> 2017-05-06 03:20:50.831250 7f7877265700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>
> while running export-diff.
>
> Stefan
>
> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
>> Hello Json,
>>
>> while doing further testing it happens only with images created with
>> hammer and that got upgraded to jewel AND got enabled exclusive lock.
>>
>> Greets,
>> Stefan
>>
>> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
>>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>>> command and post the resulting log to a new ticket at [1]? I'd also be
>>> interested if you could re-create that
>>> "librbd::object_map::InvalidateRequest" issue repeatably.
>>> n
>>> [1] http://tracker.ceph.com/projects/rbd/issues
>>>
>>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Example:
 # rbd rm cephstor2/vm-136-disk-1
 Removing image: 99% complete...

 Stuck at 99% and never completes. This is an image which got corrupted
 for an unknown reason.

 Greets,
 Stefan

 Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
> I'm not sure whether this is related but our backup system uses rbd
> snapshots and reports sometimes messages like these:
> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
> librbd::object_map::InvalidateRequest: 0x7f3310002570 
> should_complete: r=0

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-11 Thread Jason Dillaman
Assuming the only log messages you are seeing are the following:

2017-05-06 03:20:50.830626 7f7876a64700 -1
librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
object map in-memory
2017-05-06 03:20:50.830634 7f7876a64700 -1
librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
object map on-disk
2017-05-06 03:20:50.831250 7f7877265700 -1
librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0

It looks like that can only occur if somehow the object-map on disk is
larger than the actual image size. If that's the case, how the image
got into that state is unknown to me at this point.

On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
 wrote:
> Hi Jason,
>
> it seems i can at least circumvent the crashes. Since i restarted ALL
> osds after enabling exclusive lock and rebuilding the object maps it had
> no new crashes.
>
> What still makes me wonder are those
> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>
> messages.
>
> Greets,
> Stefan
>
> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
>> Hi,
>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>>> You are saying that you had v2 RBD images created against Hammer OSDs
>>> and client libraries where exclusive lock, object map, etc were never
>>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>>> point enabled exclusive lock (and I'd assume object map) on these
>>> images
>>
>> Yes i did:
>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
>> do rbd -p cephstor5 feature enable $img
>> exclusive-lock,object-map,fast-diff || echo $img; done
>>
>>> -- or were the exclusive lock and object map features already
>>> enabled under Hammer?
>>
>> No as they were not the rbd defaults.
>>
>>> The fact that you encountered an object map error on an export
>>> operation is surprising to me.  Does that error re-occur if you
>>> perform the export again? If you can repeat it, it would be very
>>> helpful if you could run the export with "--debug-rbd=20" and capture
>>> the generated logs.
>>
>> No i can't repeat it. It happens every night but for different images.
>> But i never saw it for a vm twice. If i do he export again it works fine.
>>
>> I'm doing an rbd export or an rbd export-diff --from-snap it depends on
>> the VM and day since the last snapshot.
>>
>> Greets,
>> Stefan
>>
>>>
>>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>>  wrote:
 Hi,

 also i'm getting these errors only for pre jewel images:

 2017-05-06 03:20:50.830626 7f7876a64700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
 object map in-memory
 2017-05-06 03:20:50.830634 7f7876a64700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
 object map on-disk
 2017-05-06 03:20:50.831250 7f7877265700 -1
 librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0

 while running export-diff.

 Stefan

 Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
> Hello Json,
>
> while doing further testing it happens only with images created with
> hammer and that got upgraded to jewel AND got enabled exclusive lock.
>
> Greets,
> Stefan
>
> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>> command and post the resulting log to a new ticket at [1]? I'd also be
>> interested if you could re-create that
>> "librbd::object_map::InvalidateRequest" issue repeatably.
>> n
>> [1] http://tracker.ceph.com/projects/rbd/issues
>>
>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Example:
>>> # rbd rm cephstor2/vm-136-disk-1
>>> Removing image: 99% complete...
>>>
>>> Stuck at 99% and never completes. This is an image which got corrupted
>>> for an unknown reason.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
 I'm not sure whether this is related but our backup system uses rbd
 snapshots and reports sometimes messages like these:
 2017-05-04 02:42:47.661263 7f3316ffd700 -1
 librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: 
 r=0

 Stefan


 Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
> Hello,
>
> since we've upgraded from hammer to jewel 10.2.7 and enabled
> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
> filesystems.
>
> Sometimes the VMs are just crashing with FS errors and a restart can
> solve the problem. Sometimes the whole VM is not even bootable and we
> need to import a backup.

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-11 Thread Stefan Priebe - Profihost AG
Hi Jason,

it seems I can at least circumvent the crashes. Since I restarted ALL
osds after enabling exclusive lock and rebuilding the object maps, there
have been no new crashes.

What still makes me wonder are those

librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0

messages.

Greets,
Stefan

Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
> Hi,
> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>> You are saying that you had v2 RBD images created against Hammer OSDs
>> and client libraries where exclusive lock, object map, etc were never
>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>> point enabled exclusive lock (and I'd assume object map) on these
>> images
> 
> Yes i did:
> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
> do rbd -p cephstor5 feature enable $img
> exclusive-lock,object-map,fast-diff || echo $img; done
> 
>> -- or were the exclusive lock and object map features already
>> enabled under Hammer?
> 
> No as they were not the rbd defaults.
> 
>> The fact that you encountered an object map error on an export
>> operation is surprising to me.  Does that error re-occur if you
>> perform the export again? If you can repeat it, it would be very
>> helpful if you could run the export with "--debug-rbd=20" and capture
>> the generated logs.
> 
> No i can't repeat it. It happens every night but for different images.
> But i never saw it for a vm twice. If i do he export again it works fine.
> 
> I'm doing an rbd export or an rbd export-diff --from-snap it depends on
> the VM and day since the last snapshot.
> 
> Greets,
> Stefan
> 
>>
>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Hi,
>>>
>>> also i'm getting these errors only for pre jewel images:
>>>
>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map in-memory
>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>> object map on-disk
>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>
>>> while running export-diff.
>>>
>>> Stefan
>>>
>>> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
 Hello Json,

 while doing further testing it happens only with images created with
 hammer and that got upgraded to jewel AND got enabled exclusive lock.

 Greets,
 Stefan

 Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
> command and post the resulting log to a new ticket at [1]? I'd also be
> interested if you could re-create that
> "librbd::object_map::InvalidateRequest" issue repeatably.
> n
> [1] http://tracker.ceph.com/projects/rbd/issues
>
> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Example:
>> # rbd rm cephstor2/vm-136-disk-1
>> Removing image: 99% complete...
>>
>> Stuck at 99% and never completes. This is an image which got corrupted
>> for an unknown reason.
>>
>> Greets,
>> Stefan
>>
>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>>> I'm not sure whether this is related but our backup system uses rbd
>>> snapshots and reports sometimes messages like these:
>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: 
>>> r=0
>>>
>>> Stefan
>>>
>>>
>>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
 Hello,

 since we've upgraded from hammer to jewel 10.2.7 and enabled
 exclusive-lock,object-map,fast-diff we've problems with corrupting VM
 filesystems.

 Sometimes the VMs are just crashing with FS errors and a restart can
 solve the problem. Sometimes the whole VM is not even bootable and we
 need to import a backup.

 All of them have the same problem that you can't revert to an older
 snapshot. The rbd command just hangs at 99% forever.

 Is this a known issue - anythink we can check?

 Greets,
 Stefan

>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>>
>>
>>


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-08 Thread Stefan Priebe - Profihost AG
Hi Peter,



Am 08.05.2017 um 15:23 schrieb Peter Maloney:
> On 05/08/17 14:50, Stefan Priebe - Profihost AG wrote:
>> Hi,
>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>>> You are saying that you had v2 RBD images created against Hammer OSDs
>>> and client libraries where exclusive lock, object map, etc were never
>>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>>> point enabled exclusive lock (and I'd assume object map) on these
>>> images
>> Yes i did:
>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
>> do rbd -p cephstor5 feature enable $img
>> exclusive-lock,object-map,fast-diff || echo $img; done
> If that's all you did, and not also
> rbd object-map rebuild $img

Sorry, we did an "rbd object-map rebuild" as well in a second run; I just
missed this command in my mail.

> And in my testing, this randomly doesn't work, often saying the object
> map is invalid after some point (in rbd info output), unless you do
> something... maybe disconnect all clients, then rebuild it, then clients
> can connect again, or just repeat a few times until it stays. I haven't
> tested that well.

No, I've not disconnected the clients. I've seen no information that the
clients need to reconnect, and I also have no idea how to force a
reconnect while QEMU is running.

Greets,
Stefan



Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-08 Thread Peter Maloney
On 05/08/17 14:50, Stefan Priebe - Profihost AG wrote:
> Hi,
> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>> You are saying that you had v2 RBD images created against Hammer OSDs
>> and client libraries where exclusive lock, object map, etc were never
>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>> point enabled exclusive lock (and I'd assume object map) on these
>> images
> Yes i did:
> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
> do rbd -p cephstor5 feature enable $img
> exclusive-lock,object-map,fast-diff || echo $img; done
If that's all you did, and not also

rbd object-map rebuild $img

then it won't have a proper object map, which is probably more like the
feature being disabled rather than being more buggy.

And in my testing this randomly doesn't work: often the object map is
reported as invalid again after some point (in the rbd info output),
unless you do something... maybe disconnect all clients, then rebuild it,
then let clients connect again, or just repeat a few times until it
stays. I haven't tested that thoroughly.
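
Something like the following is what I mean - rebuild and then check
whether the invalid flag comes back; the pool name is just an example:

for img in $(rbd -p cephstor5 ls); do
    rbd -p cephstor5 object-map rebuild "$img"
    # "rbd info" prints a "flags: object map invalid" line when the
    # rebuild did not stick
    rbd -p cephstor5 info "$img" | grep -q 'object map invalid' && echo "$img still invalid"
done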


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-08 Thread Stefan Priebe - Profihost AG
Hi,
Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
> You are saying that you had v2 RBD images created against Hammer OSDs
> and client libraries where exclusive lock, object map, etc were never
> enabled. You then upgraded the OSDs and clients to Jewel and at some
> point enabled exclusive lock (and I'd assume object map) on these
> images

Yes, I did:
for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
do rbd -p cephstor5 feature enable $img
exclusive-lock,object-map,fast-diff || echo $img; done

> -- or were the exclusive lock and object map features already
> enabled under Hammer?

No, as they were not the rbd defaults.

> The fact that you encountered an object map error on an export
> operation is surprising to me.  Does that error re-occur if you
> perform the export again? If you can repeat it, it would be very
> helpful if you could run the export with "--debug-rbd=20" and capture
> the generated logs.

No, I can't repeat it. It happens every night, but for different images;
I never saw it for a VM twice. If I do the export again it works fine.

I'm doing either an rbd export or an rbd export-diff --from-snap; it
depends on the VM and the days since the last snapshot.
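
Roughly like this (names and snapshots are placeholders); this is also
where I could add --debug-rbd=20 the next time it happens:

# full export:
rbd export cephstor5/vm-123-disk-1@backup-2017-05-08 /backup/vm-123-disk-1.full

# incremental export against the previous night's snapshot:
rbd export-diff --from-snap backup-2017-05-07 \
    cephstor5/vm-123-disk-1@backup-2017-05-08 /backup/vm-123-disk-1.diff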

Greets,
Stefan

> 
> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>  wrote:
>> Hi,
>>
>> also i'm getting these errors only for pre jewel images:
>>
>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map in-memory
>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>> object map on-disk
>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>
>> while running export-diff.
>>
>> Stefan
>>
>> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
>>> Hello Json,
>>>
>>> while doing further testing it happens only with images created with
>>> hammer and that got upgraded to jewel AND got enabled exclusive lock.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
 Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
 command and post the resulting log to a new ticket at [1]? I'd also be
 interested if you could re-create that
 "librbd::object_map::InvalidateRequest" issue repeatably.
 n
 [1] http://tracker.ceph.com/projects/rbd/issues

 On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
  wrote:
> Example:
> # rbd rm cephstor2/vm-136-disk-1
> Removing image: 99% complete...
>
> Stuck at 99% and never completes. This is an image which got corrupted
> for an unknown reason.
>
> Greets,
> Stefan
>
> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>> I'm not sure whether this is related but our backup system uses rbd
>> snapshots and reports sometimes messages like these:
>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: 
>> r=0
>>
>> Stefan
>>
>>
>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>>> Hello,
>>>
>>> since we've upgraded from hammer to jewel 10.2.7 and enabled
>>> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
>>> filesystems.
>>>
>>> Sometimes the VMs are just crashing with FS errors and a restart can
>>> solve the problem. Sometimes the whole VM is not even bootable and we
>>> need to import a backup.
>>>
>>> All of them have the same problem that you can't revert to an older
>>> snapshot. The rbd command just hangs at 99% forever.
>>>
>>> Is this a known issue - anythink we can check?
>>>
>>> Greets,
>>> Stefan
>>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



> 
> 
> 


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-08 Thread Jason Dillaman
You are saying that you had v2 RBD images created against Hammer OSDs
and client libraries where exclusive lock, object map, etc were never
enabled. You then upgraded the OSDs and clients to Jewel and at some
point enabled exclusive lock (and I'd assume object map) on these
images -- or were the exclusive lock and object map features already
enabled under Hammer?

The fact that you encountered an object map error on an export
operation is surprising to me.  Does that error re-occur if you
perform the export again? If you can repeat it, it would be very
helpful if you could run the export with "--debug-rbd=20" and capture
the generated logs.
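
For example, something along these lines (just a sketch; image,
snapshot and log path are placeholders, and it assumes the usual
client-side --log-file override):

# re-run the failing export with librbd debug logging written to a file
rbd export cephstor5/vm-123-disk-1@backup-20170508 /backup/vm-123-disk-1.img \
    --debug-rbd=20 --log-file=/tmp/rbd-export-debug.log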

On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
 wrote:
> Hi,
>
> also i'm getting these errors only for pre jewel images:
>
> 2017-05-06 03:20:50.830626 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map in-memory
> 2017-05-06 03:20:50.830634 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map on-disk
> 2017-05-06 03:20:50.831250 7f7877265700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>
> while running export-diff.
>
> Stefan
>
> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
>> Hello Json,
>>
>> while doing further testing it happens only with images created with
>> hammer and that got upgraded to jewel AND got enabled exclusive lock.
>>
>> Greets,
>> Stefan
>>
>> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
>>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>>> command and post the resulting log to a new ticket at [1]? I'd also be
>>> interested if you could re-create that
>>> "librbd::object_map::InvalidateRequest" issue repeatably.
>>>
>>> [1] http://tracker.ceph.com/projects/rbd/issues
>>>
>>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>>  wrote:
 Example:
 # rbd rm cephstor2/vm-136-disk-1
 Removing image: 99% complete...

 Stuck at 99% and never completes. This is an image which got corrupted
 for an unknown reason.

 Greets,
 Stefan

 Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
> I'm not sure whether this is related but our backup system uses rbd
> snapshots and reports sometimes messages like these:
> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>
> Stefan
>
>
> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>> Hello,
>>
>> since we've upgraded from hammer to jewel 10.2.7 and enabled
>> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
>> filesystems.
>>
>> Sometimes the VMs are just crashing with FS errors and a restart can
>> solve the problem. Sometimes the whole VM is not even bootable and we
>> need to import a backup.
>>
>> All of them have the same problem that you can't revert to an older
>> snapshot. The rbd command just hangs at 99% forever.
>>
>> Is this a known issue - anythink we can check?
>>
>> Greets,
>> Stefan
>>
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-06 Thread Stefan Priebe - Profihost AG
Hi,

Also, I'm getting these errors only for pre-Jewel images:

2017-05-06 03:20:50.830626 7f7876a64700 -1
librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
object map in-memory
2017-05-06 03:20:50.830634 7f7876a64700 -1
librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
object map on-disk
2017-05-06 03:20:50.831250 7f7877265700 -1
librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0

while running export-diff.
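
For reference, a sketch of how an invalidated object map could be
checked and rebuilt afterwards (pool and image name are placeholders):

# check whether the image is flagged with an invalid object map
rbd -p cephstor5 info vm-123-disk-1 | grep -E 'features|flags'

# rebuild the object map for that image
rbd -p cephstor5 object-map rebuild vm-123-disk-1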

Stefan

Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
> Hello Json,
> 
> while doing further testing it happens only with images created with
> hammer and that got upgraded to jewel AND got enabled exclusive lock.
> 
> Greets,
> Stefan
> 
> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>> command and post the resulting log to a new ticket at [1]? I'd also be
>> interested if you could re-create that
>> "librbd::object_map::InvalidateRequest" issue repeatably.
>>
>> [1] http://tracker.ceph.com/projects/rbd/issues
>>
>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>  wrote:
>>> Example:
>>> # rbd rm cephstor2/vm-136-disk-1
>>> Removing image: 99% complete...
>>>
>>> Stuck at 99% and never completes. This is an image which got corrupted
>>> for an unknown reason.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
 I'm not sure whether this is related but our backup system uses rbd
 snapshots and reports sometimes messages like these:
 2017-05-04 02:42:47.661263 7f3316ffd700 -1
 librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0

 Stefan


 Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
> Hello,
>
> since we've upgraded from hammer to jewel 10.2.7 and enabled
> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
> filesystems.
>
> Sometimes the VMs are just crashing with FS errors and a restart can
> solve the problem. Sometimes the whole VM is not even bootable and we
> need to import a backup.
>
> All of them have the same problem that you can't revert to an older
> snapshot. The rbd command just hangs at 99% forever.
>
> Is this a known issue - anythink we can check?
>
> Greets,
> Stefan
>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-05 Thread Stefan Priebe - Profihost AG
Hello Jason,

While doing further testing it turned out that this happens only with
images that were created with Hammer, got upgraded to Jewel, AND had
exclusive-lock enabled.
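
For reference, a rough sketch of how the affected images can be
identified by the features they have enabled (pool name is a
placeholder):

# print the enabled feature set of every image in a pool
for img in $(rbd -p cephstor5 ls); do
  echo -n "$img: "
  rbd -p cephstor5 info $img | grep features
done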

Greets,
Stefan

Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
> command and post the resulting log to a new ticket at [1]? I'd also be
> interested if you could re-create that
> "librbd::object_map::InvalidateRequest" issue repeatably.
> 
> [1] http://tracker.ceph.com/projects/rbd/issues
> 
> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Example:
>> # rbd rm cephstor2/vm-136-disk-1
>> Removing image: 99% complete...
>>
>> Stuck at 99% and never completes. This is an image which got corrupted
>> for an unknown reason.
>>
>> Greets,
>> Stefan
>>
>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>>> I'm not sure whether this is related but our backup system uses rbd
>>> snapshots and reports sometimes messages like these:
>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>>
>>> Stefan
>>>
>>>
>>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
 Hello,

 since we've upgraded from hammer to jewel 10.2.7 and enabled
 exclusive-lock,object-map,fast-diff we've problems with corrupting VM
 filesystems.

 Sometimes the VMs are just crashing with FS errors and a restart can
 solve the problem. Sometimes the whole VM is not even bootable and we
 need to import a backup.

 All of them have the same problem that you can't revert to an older
 snapshot. The rbd command just hangs at 99% forever.

 Is this a known issue - anythink we can check?

 Greets,
 Stefan

>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Brian Andrus
Hi Stefan - we simply disabled exclusive-lock on all older (pre-Jewel)
images. We still allow the default Jewel feature sets for newly created
images because, as you mention, the issue does not seem to affect them.
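
Roughly, that boils down to something like the following sketch (pool
name is a placeholder; object-map and fast-diff depend on
exclusive-lock, so all three get disabled together):

# disable the Jewel-era features again on every image in a pool
for img in $(rbd -p cephstor2 ls); do
  rbd -p cephstor2 feature disable $img object-map,fast-diff,exclusive-lock || echo "failed: $img"
done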

On Thu, May 4, 2017 at 10:19 AM, Stefan Priebe - Profihost AG <
s.pri...@profihost.ag> wrote:

> Hello Brian,
>
> this really sounds the same. I don't see this on a cluster with only
> images created AFTER jewel. And it seems to start happening after i
> enabled exclusive lock on all images.
>
> Did just use feature disable, exclusive-lock,fast-diff,object-map or did
> you also restart all those vms?
>
> Greets,
> Stefan
>
> Am 04.05.2017 um 19:11 schrieb Brian Andrus:
> > Sounds familiar... and discussed in "disk timeouts in libvirt/qemu
> VMs..."
> >
> > We have not had this issue since reverting exclusive-lock, but it was
> > suggested this was not the issue. So far it's held up for us with not a
> > single corrupt filesystem since then.
> >
> > On some images (ones created post-Jewel upgrade) the feature could not
> > be disabled, but these don't seem to be affected. Of course, we never
> > did pinpoint the cause of timeouts, so it's entirely possible something
> > else was causing it but no other major changes went into effect.
> >
> > One thing to look for that might confirm the same issue are timeouts in
> > the guest VM. Most OS kernel will report a hung task in conjunction with
> > the hang up/lock/corruption. Wondering if you're seeing that too.
> >
> > On Wed, May 3, 2017 at 10:49 PM, Stefan Priebe - Profihost AG
> > > wrote:
> >
> > Hello,
> >
> > since we've upgraded from hammer to jewel 10.2.7 and enabled
> > exclusive-lock,object-map,fast-diff we've problems with corrupting
> VM
> > filesystems.
> >
> > Sometimes the VMs are just crashing with FS errors and a restart can
> > solve the problem. Sometimes the whole VM is not even bootable and we
> > need to import a backup.
> >
> > All of them have the same problem that you can't revert to an older
> > snapshot. The rbd command just hangs at 99% forever.
> >
> > Is this a known issue - anythink we can check?
> >
> > Greets,
> > Stefan
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
> >
> >
> >
> >
> > --
> > Brian Andrus | Cloud Systems Engineer | DreamHost
> > brian.and...@dreamhost.com | www.dreamhost.com
>



-- 
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Stefan Priebe - Profihost AG
And yes, I also see hung tasks in those VMs until they crash.

Stefan

Am 04.05.2017 um 19:11 schrieb Brian Andrus:
> Sounds familiar... and discussed in "disk timeouts in libvirt/qemu VMs..."
> 
> We have not had this issue since reverting exclusive-lock, but it was
> suggested this was not the issue. So far it's held up for us with not a
> single corrupt filesystem since then.
> 
> On some images (ones created post-Jewel upgrade) the feature could not
> be disabled, but these don't seem to be affected. Of course, we never
> did pinpoint the cause of timeouts, so it's entirely possible something
> else was causing it but no other major changes went into effect. 
> 
> One thing to look for that might confirm the same issue are timeouts in
> the guest VM. Most OS kernel will report a hung task in conjunction with
> the hang up/lock/corruption. Wondering if you're seeing that too.
> 
> On Wed, May 3, 2017 at 10:49 PM, Stefan Priebe - Profihost AG
> > wrote:
> 
> Hello,
> 
> since we've upgraded from hammer to jewel 10.2.7 and enabled
> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
> filesystems.
> 
> Sometimes the VMs are just crashing with FS errors and a restart can
> solve the problem. Sometimes the whole VM is not even bootable and we
> need to import a backup.
> 
> All of them have the same problem that you can't revert to an older
> snapshot. The rbd command just hangs at 99% forever.
> 
> Is this a known issue - anythink we can check?
> 
> Greets,
> Stefan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
> 
> -- 
> Brian Andrus | Cloud Systems Engineer | DreamHost
> brian.and...@dreamhost.com | www.dreamhost.com 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Stefan Priebe - Profihost AG
Hello Brian,

This really sounds like the same issue. I don't see it on a cluster
with only images created AFTER Jewel, and it seems to have started
happening after I enabled exclusive-lock on all images.

Did you just use "rbd feature disable exclusive-lock,fast-diff,object-map",
or did you also restart all those VMs?

Greets,
Stefan

Am 04.05.2017 um 19:11 schrieb Brian Andrus:
> Sounds familiar... and discussed in "disk timeouts in libvirt/qemu VMs..."
> 
> We have not had this issue since reverting exclusive-lock, but it was
> suggested this was not the issue. So far it's held up for us with not a
> single corrupt filesystem since then.
> 
> On some images (ones created post-Jewel upgrade) the feature could not
> be disabled, but these don't seem to be affected. Of course, we never
> did pinpoint the cause of timeouts, so it's entirely possible something
> else was causing it but no other major changes went into effect. 
> 
> One thing to look for that might confirm the same issue are timeouts in
> the guest VM. Most OS kernel will report a hung task in conjunction with
> the hang up/lock/corruption. Wondering if you're seeing that too.
> 
> On Wed, May 3, 2017 at 10:49 PM, Stefan Priebe - Profihost AG
> > wrote:
> 
> Hello,
> 
> since we've upgraded from hammer to jewel 10.2.7 and enabled
> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
> filesystems.
> 
> Sometimes the VMs are just crashing with FS errors and a restart can
> solve the problem. Sometimes the whole VM is not even bootable and we
> need to import a backup.
> 
> All of them have the same problem that you can't revert to an older
> snapshot. The rbd command just hangs at 99% forever.
> 
> Is this a known issue - anythink we can check?
> 
> Greets,
> Stefan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
> 
> -- 
> Brian Andrus | Cloud Systems Engineer | DreamHost
> brian.and...@dreamhost.com | www.dreamhost.com 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Brian Andrus
Sounds familiar... and discussed in "disk timeouts in libvirt/qemu VMs..."

We have not had this issue since reverting exclusive-lock, but it was
suggested this was not the issue. So far it's held up for us with not a
single corrupt filesystem since then.

On some images (ones created post-Jewel upgrade) the feature could not be
disabled, but these don't seem to be affected. Of course, we never did
pinpoint the cause of timeouts, so it's entirely possible something else
was causing it but no other major changes went into effect.

One thing to look for that might confirm the same issue is timeouts in
the guest VM. Most OS kernels will report a hung task in conjunction
with the hang-up/lock/corruption. Wondering if you're seeing that too.
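
For example, inside an affected guest the symptom typically shows up
like this (a sketch; the exact message depends on the guest kernel):

# inside the guest VM: look for blocked/hung task reports
dmesg | grep -iE 'hung task|blocked for more than'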

On Wed, May 3, 2017 at 10:49 PM, Stefan Priebe - Profihost AG <
s.pri...@profihost.ag> wrote:

> Hello,
>
> since we've upgraded from hammer to jewel 10.2.7 and enabled
> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
> filesystems.
>
> Sometimes the VMs are just crashing with FS errors and a restart can
> solve the problem. Sometimes the whole VM is not even bootable and we
> need to import a backup.
>
> All of them have the same problem that you can't revert to an older
> snapshot. The rbd command just hangs at 99% forever.
>
> Is this a known issue - anythink we can check?
>
> Greets,
> Stefan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Stefan Priebe - Profihost AG
Hi Jason,

> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
> command and post the resulting log to a new ticket at [1]?

Will do so next time. I was able to solve this by restarting all OSDs.
After that I was able to successfully delete the image.
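
For reference, a rough sketch of such a rolling restart on a
systemd-based host (the OSD ids are placeholders):

# restart the OSDs of one host, waiting for the cluster to settle in between
for id in 10 11 12; do
  systemctl restart ceph-osd@$id
  while ! ceph health | grep -q HEALTH_OK; do sleep 10; done
done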

> I'd also be interested if you could re-create that
> "librbd::object_map::InvalidateRequest" issue repeatably.
No, I can't. It happens randomly for 5-10 images out of 2000.

Greets,
Stefan

Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
> command and post the resulting log to a new ticket at [1]? I'd also be
> interested if you could re-create that
> "librbd::object_map::InvalidateRequest" issue repeatably.
> 
> [1] http://tracker.ceph.com/projects/rbd/issues
> 
> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Example:
>> # rbd rm cephstor2/vm-136-disk-1
>> Removing image: 99% complete...
>>
>> Stuck at 99% and never completes. This is an image which got corrupted
>> for an unknown reason.
>>
>> Greets,
>> Stefan
>>
>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>>> I'm not sure whether this is related but our backup system uses rbd
>>> snapshots and reports sometimes messages like these:
>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>>
>>> Stefan
>>>
>>>
>>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
 Hello,

 since we've upgraded from hammer to jewel 10.2.7 and enabled
 exclusive-lock,object-map,fast-diff we've problems with corrupting VM
 filesystems.

 Sometimes the VMs are just crashing with FS errors and a restart can
 solve the problem. Sometimes the whole VM is not even bootable and we
 need to import a backup.

 All of them have the same problem that you can't revert to an older
 snapshot. The rbd command just hangs at 99% forever.

 Is this a known issue - anythink we can check?

 Greets,
 Stefan

>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Jason Dillaman
Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
command and post the resulting log to a new ticket at [1]? I'd also be
interested if you could re-create that
"librbd::object_map::InvalidateRequest" issue repeatably.

[1] http://tracker.ceph.com/projects/rbd/issues
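
In other words, something along these lines (a sketch; the log path is
only an example and it assumes the usual client-side --log-file
override):

# re-run the removal with librbd debug logging captured to a file
rbd rm cephstor2/vm-136-disk-1 --debug-rbd=20 --log-file=/tmp/rbd-rm-debug.log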

On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
 wrote:
> Example:
> # rbd rm cephstor2/vm-136-disk-1
> Removing image: 99% complete...
>
> Stuck at 99% and never completes. This is an image which got corrupted
> for an unknown reason.
>
> Greets,
> Stefan
>
> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>> I'm not sure whether this is related but our backup system uses rbd
>> snapshots and reports sometimes messages like these:
>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>
>> Stefan
>>
>>
>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>>> Hello,
>>>
>>> since we've upgraded from hammer to jewel 10.2.7 and enabled
>>> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
>>> filesystems.
>>>
>>> Sometimes the VMs are just crashing with FS errors and a restart can
>>> solve the problem. Sometimes the whole VM is not even bootable and we
>>> need to import a backup.
>>>
>>> All of them have the same problem that you can't revert to an older
>>> snapshot. The rbd command just hangs at 99% forever.
>>>
>>> Is this a known issue - anythink we can check?
>>>
>>> Greets,
>>> Stefan
>>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Stefan Priebe - Profihost AG
There are no watchers involved:
# rbd status cephstor2/vm-136-disk-1
Watchers: none
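
For what it's worth, the header object can presumably also be checked
directly (a sketch; <id> stands for the image id taken from the
block_name_prefix line of "rbd info"):

# block_name_prefix looks like rbd_data.<id>; the header object is rbd_header.<id>
rbd info cephstor2/vm-136-disk-1 | grep block_name_prefix
rados -p cephstor2 listwatchers rbd_header.<id>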

Greets,
Stefan

Am 04.05.2017 um 09:45 schrieb Stefan Priebe - Profihost AG:
> Example:
> # rbd rm cephstor2/vm-136-disk-1
> Removing image: 99% complete...
> 
> Stuck at 99% and never completes. This is an image which got corrupted
> for an unknown reason.
> 
> Greets,
> Stefan
> 
> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>> I'm not sure whether this is related but our backup system uses rbd
>> snapshots and reports sometimes messages like these:
>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>
>> Stefan
>>
>>
>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>>> Hello,
>>>
>>> since we've upgraded from hammer to jewel 10.2.7 and enabled
>>> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
>>> filesystems.
>>>
>>> Sometimes the VMs are just crashing with FS errors and a restart can
>>> solve the problem. Sometimes the whole VM is not even bootable and we
>>> need to import a backup.
>>>
>>> All of them have the same problem that you can't revert to an older
>>> snapshot. The rbd command just hangs at 99% forever.
>>>
>>> Is this a known issue - anythink we can check?
>>>
>>> Greets,
>>> Stefan
>>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Stefan Priebe - Profihost AG
Example:
# rbd rm cephstor2/vm-136-disk-1
Removing image: 99% complete...

Stuck at 99% and never completes. This is an image which got corrupted
for an unknown reason.

Greets,
Stefan

Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
> I'm not sure whether this is related but our backup system uses rbd
> snapshots and reports sometimes messages like these:
> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
> 
> Stefan
> 
> 
> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>> Hello,
>>
>> since we've upgraded from hammer to jewel 10.2.7 and enabled
>> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
>> filesystems.
>>
>> Sometimes the VMs are just crashing with FS errors and a restart can
>> solve the problem. Sometimes the whole VM is not even bootable and we
>> need to import a backup.
>>
>> All of them have the same problem that you can't revert to an older
>> snapshot. The rbd command just hangs at 99% forever.
>>
>> Is this a known issue - anythink we can check?
>>
>> Greets,
>> Stefan
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-04 Thread Stefan Priebe - Profihost AG
I'm not sure whether this is related, but our backup system uses rbd
snapshots and sometimes reports messages like these:
2017-05-04 02:42:47.661263 7f3316ffd700 -1
librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
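
For context, such a snapshot-based backup cycle roughly looks like the
following sketch (pool, image and snapshot names are placeholders):

# take a new snapshot, diff it against yesterday's, then drop the old snapshot
rbd -p cephstor2 snap create vm-123-disk-1@backup-20170504
rbd -p cephstor2 export-diff --from-snap backup-20170503 \
    vm-123-disk-1@backup-20170504 /backup/vm-123-disk-1.diff
rbd -p cephstor2 snap rm vm-123-disk-1@backup-20170503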

Stefan


Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
> Hello,
> 
> since we've upgraded from hammer to jewel 10.2.7 and enabled
> exclusive-lock,object-map,fast-diff we've problems with corrupting VM
> filesystems.
> 
> Sometimes the VMs are just crashing with FS errors and a restart can
> solve the problem. Sometimes the whole VM is not even bootable and we
> need to import a backup.
> 
> All of them have the same problem that you can't revert to an older
> snapshot. The rbd command just hangs at 99% forever.
> 
> Is this a known issue - anythink we can check?
> 
> Greets,
> Stefan
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] corrupted rbd filesystems since jewel

2017-05-03 Thread Stefan Priebe - Profihost AG
Hello,

Since we upgraded from Hammer to Jewel 10.2.7 and enabled
exclusive-lock,object-map,fast-diff, we have had problems with
corrupted VM filesystems.

Sometimes the VMs are just crashing with FS errors and a restart can
solve the problem. Sometimes the whole VM is not even bootable and we
need to import a backup.

All of them have the same problem: you can't revert to an older
snapshot. The rbd command just hangs at 99% forever.

Is this a known issue - anything we can check?

Greets,
Stefan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com