Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Sage Weil
On Tue, 6 Oct 2015, Robert LeBlanc wrote: > Thanks for your time Sage. It sounds like a few people may be helped if you > can find something. > > I did a recursive chown as in the instructions (although I didn't know about > the doc at the time). I did an osd debug at 20/20 but didn't see

CephFS and the next hammer release v0.94.4

2015-10-06 Thread Loic Dachary
Hi Greg, The next hammer release as found at https://github.com/ceph/ceph/tree/hammer passed the fs suite (http://tracker.ceph.com/issues/12701#note-66). Do you think the hammer branch is ready for QE to start their own round of testing? Cheers P.S.

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Sage Weil
On Tue, 6 Oct 2015, Robert LeBlanc wrote: > I can't think of anything. In my dev cluster the only thing that has > changed is the Ceph versions (no reboot). What I like is even though > the disks are 100% utilized, it is performing as I expect

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Robert LeBlanc
This was from the monitor (can't bring it up with Hammer now, complete cluster is down, this is only my lab, so no urgency). I got it up and running this way: 1. Upgraded the mon node to Infernalis and started the mon. 2. Downgraded the OSDs to

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Sage Weil
On Tue, 6 Oct 2015, Robert LeBlanc wrote: > OK, an interesting point. Running ceph version 9.0.3-2036-g4f54a0d > (4f54a0dd7c4a5c8bdc788c8b7f58048b2a28b9be) looks a lot better. I got > messages when the OSD was marked out: > > 2015-10-06

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Robert LeBlanc
OK, an interesting point. Running ceph version 9.0.3-2036-g4f54a0d (4f54a0dd7c4a5c8bdc788c8b7f58048b2a28b9be) looks a lot better. I got messages when the OSD was marked out: 2015-10-06 11:52:46.961040 osd.13 192.168.55.12:6800/20870 81 : cluster

Re: Minor error in the firefly release notes

2015-10-06 Thread Patrick McGarry
Thanks for pointing this out. Probably a good thing to have on the list for broader note though. Guessing Sage or someone is better equipped to make changes to the release notes. On Tue, Oct 6, 2015 at 2:09 PM, Dr. Christopher Kunz wrote: > Hey, > >

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Robert LeBlanc
I'll capture another set of logs. Is there any other debugging you want turned up? I've seen the same thing where I see the message dispatched to the secondary OSD, but the message just doesn't show up for 30+ seconds in the secondary OSD logs. -

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Robert LeBlanc
On my second test (a much longer one), it took nearly an hour, but a few messages have popped up over a 20 window. Still far less than I have been seeing. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2

[no subject]

2015-10-06 Thread Aakanksha Pudipeddi-SSI
subscribe ceph-devel -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

radosgw with openstack auth v3

2015-10-06 Thread Luis Periquito
Hi, I was trying to get radosgw to authenticate using API v3, as I thought it would be relatively straightforward, but my C++ is not up to standard. That and the time it takes to compile and make the radosgw binary is way too long. Is there a way to just compile radosgw so I can make the tests

Re: Minor error in the firefly release notes

2015-10-06 Thread Sage Weil
Fixed in 5e9cf8e On Tue, 6 Oct 2015, Patrick McGarry wrote: > Thanks for pointing this out. Probably a good thing to have on the > list for broader note though. Guessing Sage or someone is better > equipped to make changes to the release notes. > > > On Tue, Oct 6, 2015 at 2:09 PM, Dr.

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Robert LeBlanc
I can't think of anything. In my dev cluster the only thing that has changed is the Ceph versions (no reboot). What I like is even though the disks are 100% utilized, it is performing as I expect now. Client I/O is slightly degraded during the

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Robert LeBlanc
I upped the debug on about everything and ran the test for about 40 minutes. I took osd.19 on ceph1 down and then brought it back in. There was at least one op on osd.19 that was blocked for over 1,000 seconds. Hopefully this will have something

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Sage Weil
On Mon, 5 Oct 2015, Robert LeBlanc wrote: > With some off-list help, we have adjusted > osd_client_message_cap=1. This seems to have helped a bit and we > have seen some OSDs have a value up to 4,000 for client messages. But > it does not

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Sage Weil
On Tue, 6 Oct 2015, Ken Dreyer wrote: > On Tue, Oct 6, 2015 at 8:38 AM, Sage Weil wrote: > > Oh.. I bet you didn't upgrade the osds to 0.94.4 (or latest hammer build) > > first. They won't be allowed to boot until that happens... all upgrades > > must stop at 0.94.4 first. > >

Re: rados and the next hammer release v0.94.4

2015-10-06 Thread Samuel Just
The failure labeled 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 5' is a bit odd, the osd crashed due to 0> 2015-10-03 05:21:10.776554 7fce619f0700 -1 common/HeartbeatMap.cc: In function 'bool

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Ken Dreyer
On Tue, Oct 6, 2015 at 8:38 AM, Sage Weil wrote: > Oh.. I bet you didn't upgrade the osds to 0.94.4 (or latest hammer build) > first. They won't be allowed to boot until that happens... all upgrades > must stop at 0.94.4 first. This sounds pretty crucial. Is there Redmine

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Sage Weil
On Tue, 6 Oct 2015, Robert LeBlanc wrote: > I downgraded to the hammer gitbuilder branch, but it looks like I've > passed the point of no return: > > 2015-10-06 09:44:52.210873 7fd3dd8b78c0 -1 ERROR: on disk data > includes unsupported features: > compat={},rocompat={},incompat={7=support shec

Re: [ceph-users] Potential OSD deadlock?

2015-10-06 Thread Robert LeBlanc
I downgraded to the hammer gitbuilder branch, but it looks like I've passed the point of no return: 2015-10-06 09:44:52.210873 7fd3dd8b78c0 -1 ERROR: on disk data includes unsupported features: compat={},rocompat={},incompat={7=support shec erasure code} 2015-10-06 09:44:52.210922 7fd3dd8b78c0 -1