librbd bug?

2013-03-07 Thread Wolfgang Hennerbichler
Hi, I've a libvirt-VM that gets format 2 rbd-childs 'fed' by the superhost. It crashed recently with this in the logs: osdc/ObjectCacher.cc: In function 'void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t, int)' thread 7f0cab5fd700 time 2013-03-01 22:02:37.374410

[PATCH 1/2] ceph: increase i_release_count when clear I_COMPLETE flag

2013-03-07 Thread Yan, Zheng
From: Yan, Zheng zheng.z@intel.com If some dentries were pruned or the FILE_SHARED cap was revoked while readdir is in progress, make sure ceph_readdir() does not mark the directory as complete. Signed-off-by: Yan, Zheng zheng.z@intel.com --- fs/ceph/caps.c | 1 + fs/ceph/dir.c | 13
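The race this patch description refers to can be illustrated with a small toy model. This is only a sketch, not the kernel code: the class `DirInode`, the method `clear_complete`, and the `on_entry` hook are hypothetical names, and the assumption that readdir compares a counter snapshot before marking the directory complete is inferred from the patch title rather than taken from the diff.

```python
class DirInode:
    """Toy stand-in for a cached directory inode (not the kernel structure)."""

    def __init__(self):
        self.complete = False    # models the I_COMPLETE flag
        self.release_count = 0   # models i_release_count

    def clear_complete(self):
        # Bump the counter every time completeness is invalidated, so an
        # in-flight readdir can notice even if the flag was already clear.
        self.complete = False
        self.release_count += 1


def readdir(inode, entries, on_entry=None):
    """List entries; mark the directory complete only if nothing was
    invalidated while the listing was in progress."""
    seen_count = inode.release_count  # snapshot taken before listing starts
    result = []
    for name in entries:
        result.append(name)
        if on_entry is not None:
            on_entry(name)  # hook used to simulate a concurrent prune
    if inode.release_count == seen_count:
        inode.complete = True
    return result
```

In this model a flag alone would not be enough: if the flag is already clear when a dentry is pruned mid-listing, clearing it again leaves no trace, whereas the counter always changes.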

[PATCH 2/2] fs: fix dentry_lru_prune()

2013-03-07 Thread Yan, Zheng
From: Yan, Zheng zheng.z@intel.com dentry_lru_prune() should always call file system's d_prune callback. Signed-off-by: Yan, Zheng zheng.z@intel.com --- fs/dcache.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index

Re: CephFS First product release discussion

2013-03-07 Thread Jimmy Tang
On 5 Mar 2013, at 17:03, Greg Farnum wrote: This is a companion discussion to the blog post at http://ceph.com/dev-notes/cephfs-mds-status-discussion/ — go read that! The short and slightly alternate version: I spent most of about two weeks working on bugs related to snapshots in the

Re: CephFS Space Accounting and Quotas

2013-03-07 Thread Jim Schutt
On 03/06/2013 05:18 PM, Greg Farnum wrote: On Wednesday, March 6, 2013 at 3:14 PM, Jim Schutt wrote: When I'm doing these stat operations the file system is otherwise idle. What's the cluster look like? This is just one active MDS and a couple hundred clients? 1 mds, 1 mon, 576 osds, 198

Re: librbd bug?

2013-03-07 Thread Sage Weil
On Thu, 7 Mar 2013, Wolfgang Hennerbichler wrote: Hi, I've a libvirt-VM that gets format 2 rbd-childs 'fed' by the superhost. It crashed recently with this in the logs: osdc/ObjectCacher.cc: In function 'void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t,

Re: MDS running at 100% CPU, no clients

2013-03-07 Thread Greg Farnum
This isn't bringing up anything in my brain, but I don't know what that _sample() function is actually doing — did you get any farther into it? -Greg On Wednesday, March 6, 2013 at 6:23 PM, Noah Watkins wrote: Which, looks to be in a tight loop in the memory model _sample… (gdb) bt #0

Re: MDS running at 100% CPU, no clients

2013-03-07 Thread Noah Watkins
On Mar 7, 2013, at 9:24 AM, Greg Farnum g...@inktank.com wrote: This isn't bringing up anything in my brain, but I don't know what that _sample() function is actually doing — did you get any farther into it? _sample reads /proc/self/maps in a loop until eof or some other conditions. i
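For readers unfamiliar with the file format, here is a rough sketch of the kind of loop described above: walk the lines of /proc/self/maps until EOF and add up the size of each mapping. It is not Ceph's _sample(); the function name `sum_mapped_bytes` and the `max_lines` cap (a stand-in for the "other conditions") are assumptions made for illustration.

```python
def sum_mapped_bytes(lines, max_lines=None):
    """Sum the sizes of the address ranges found in maps-style lines.

    Each line is expected to start with 'start-end' in hexadecimal, as in
    /proc/<pid>/maps. Lines that do not match are skipped.
    """
    total = 0
    for count, line in enumerate(lines):
        if max_lines is not None and count >= max_lines:
            break  # hypothetical early-exit condition
        fields = line.split()
        if not fields:
            continue
        start, sep, end = fields[0].partition("-")
        if not sep:
            continue  # not an address range
        total += int(end, 16) - int(start, 16)
    return total
```

On Linux it could be fed an open file object for /proc/self/maps; iterating the file ends at EOF, so a loop of this shape should only spin if the termination condition is never reached.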

Re: stuff for v0.56.4

2013-03-07 Thread Yehuda Sadeh
On Tue, Mar 5, 2013 at 3:10 PM, Sage Weil s...@inktank.com wrote: There have been a few important bug fixes that people are hitting or want: - the journal replay bug (5d54ab154ca790688a6a1a2ad5f869c17a23980a) - the - _ pool name vs cap parsing thing that is biting openstack users -

Is linux 3.8.2 up to date with all ceph patches?

2013-03-07 Thread Nick Bartos
I'm looking at upgrading to 3.8.2 from 3.5.7 with patches, and I just wanted to make sure that there weren't any additional ceph fixes that should be applied to 3.8.2. -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More

changes to rados command

2013-03-07 Thread Andrew Hume
in order to make the rados command more useful in scripts, i'd like to make a change, specifically change to rados -p pool getomapval obj key [fmt] where fmt is an optional formatting parameter. i've implemented 'str' which will print the value as an unadorned string. what is the process for
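A minimal sketch of what an optional formatting parameter like this might do, written in Python rather than in the rados tool's own code. The function `format_omap_value` is hypothetical, and the hex fallback used when no format is given is an assumption for illustration, not the command's actual default output.

```python
def format_omap_value(value, fmt=None):
    """Render a raw omap value (bytes) for printing.

    fmt=None  -> hex digits (assumed fallback, for illustration only)
    fmt='str' -> the value as an unadorned string
    """
    if fmt is None:
        return value.hex()
    if fmt == "str":
        # Undecodable bytes are replaced rather than raising, so scripts
        # always get some output.
        return value.decode("utf-8", errors="replace")
    raise ValueError(f"unknown format: {fmt!r}")
```

Keeping the format optional means existing callers that pass only the object and key would see no change in behaviour.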

Re: [ceph-users] Using different storage types on same osd hosts?

2013-03-07 Thread Stefan Priebe
On 06.03.2013 09:58, Martin B Nielsen wrote: Hi, We did the opposite here; adding some SSD in free slots after having a normal cluster running with SATA. Thanks for your answer. Why did you do this? Was it too slow with SATA? Stefan

Re: changes to rados command

2013-03-07 Thread Greg Farnum
On Thursday, March 7, 2013 at 11:25 AM, Andrew Hume wrote: in order to make the rados command more useful in scripts, i'd like to make a change, specifically change to rados -p pool getomapval obj key [fmt] where fmt is an optional formatting parameter. i've implemented 'str' which will

Re: changes to rados command

2013-03-07 Thread Greg Farnum
(Re-added the list for future reference) Well, you'll need to learn how to use git at a basic level in order to be able to work effectively on Ceph (or most other open-source projects). Some links that might be helpful: http://www.joelonsoftware.com/items/2010/03/17.html

Re: [PATCH 1/2] ceph: increase i_release_count when clear I_COMPLETE flag

2013-03-07 Thread Greg Farnum
I'm pulling this in for now to make sure this clears out that ENOENT bug we hit — but shouldn't we be fixing ceph_i_clear() to always bump the i_release_count? It doesn't seem like it would ever be correct without it, and these are the only two callers. The second one looks good to us and

Re: stuff for v0.56.4

2013-03-07 Thread Bryan K. Wright
s...@inktank.com said: - pg log trimming (probably a conservative subset) to avoid memory bloat Anything that reduces the size of OSD processes would be appreciated. Bryan

Re: Is linux 3.8.2 up to date with all ceph patches?

2013-03-07 Thread Sage Weil
On Thu, 7 Mar 2013, Nick Bartos wrote: I'm looking at upgrading to 3.8.2 from 3.5.7 with patches, and I just wanted to make sure that there weren't any additional ceph fixes that should be applied to 3.8.2. Nothing that I'm aware of. Alex? sage

Re: stuff for v0.56.4

2013-03-07 Thread Sage Weil
On Thu, 7 Mar 2013, Bryan K. Wright wrote: s...@inktank.com said: - pg log trimming (probably a conservative subset) to avoid memory bloat Anything that reduces the size of OSD processes would be appreciated. You can probably do this with just log max recent = 1000 By default it's

Re: OpenStack summit : Ceph design session

2013-03-07 Thread Loic Dachary
Hi Yehuda, I'm not sure if one keystone for all zones would be better than one keystone per zone. If you think it's worth discussing during the OpenStack summit and you create a session http://summit.openstack.org/cfp/create in the keystone track, I will definitely attend :-). Or I can do it

Re: Is linux 3.8.2 up to date with all ceph patches?

2013-03-07 Thread Alex Elder
On 03/07/2013 03:23 PM, Sage Weil wrote: On Thu, 7 Mar 2013, Nick Bartos wrote: I'm looking at upgrading to 3.8.2 from 3.5.7 with patches, and I just wanted to make sure that there weren't any additional ceph fixes that should be applied to 3.8.2. Nothing that I'm aware of. Alex? I will

Re: [PATCH 1/2] ceph: increase i_release_count when clear I_COMPLETE flag

2013-03-07 Thread Yan, Zheng
On Fri, Mar 8, 2013 at 5:03 AM, Greg Farnum g...@inktank.com wrote: I'm pulling this in for now to make sure this clears out that ENOENT bug we hit — but shouldn't we be fixing ceph_i_clear() to always bump the i_release_count? It doesn't seem like it would ever be correct without it, and

Re: [PATCH 2/2] fs: fix dentry_lru_prune()

2013-03-07 Thread Dave Chinner
On Thu, Mar 07, 2013 at 07:37:36PM +0800, Yan, Zheng wrote: From: Yan, Zheng zheng.z@intel.com dentry_lru_prune() should always call file system's d_prune callback. Why? What bug does this fix? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/2] fs: fix dentry_lru_prune()

2013-03-07 Thread Yan, Zheng
On 03/08/2013 10:04 AM, Dave Chinner wrote: On Thu, Mar 07, 2013 at 07:37:36PM +0800, Yan, Zheng wrote: From: Yan, Zheng zheng.z@intel.com dentry_lru_prune() should always call file system's d_prune callback. Why? What bug does this fix? Ceph uses a flag to track if the dcache

Re: Mon losing touch with OSDs

2013-03-07 Thread Chris Dunlop
On Thu, Feb 28, 2013 at 09:00:24PM -0800, Sage Weil wrote: On Fri, 1 Mar 2013, Chris Dunlop wrote: On Sat, Feb 23, 2013 at 01:02:53PM +1100, Chris Dunlop wrote: On Fri, Feb 22, 2013 at 05:52:11PM -0800, Sage Weil wrote: On Sat, 23 Feb 2013, Chris Dunlop wrote: On Fri, Feb 22, 2013 at

Re: librbd bug?

2013-03-07 Thread Dan Mick
On 03/07/2013 02:16 AM, Wolfgang Hennerbichler wrote: Hi, I've a libvirt-VM that gets format 2 rbd-childs 'fed' by the superhost. It crashed recently with this in the logs: osdc/ObjectCacher.cc: In function 'void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t,

Re: librbd bug?

2013-03-07 Thread Sage Weil
On Thu, 7 Mar 2013, Dan Mick wrote: On 03/07/2013 02:16 AM, Wolfgang Hennerbichler wrote: Hi, I've a libvirt-VM that gets format 2 rbd-childs 'fed' by the superhost. It crashed recently with this in the logs: osdc/ObjectCacher.cc: In function 'void

Re: [PATCH 2/2] fs: fix dentry_lru_prune()

2013-03-07 Thread Dave Chinner
On Fri, Mar 08, 2013 at 10:43:00AM +0800, Yan, Zheng wrote: On 03/08/2013 10:04 AM, Dave Chinner wrote: On Thu, Mar 07, 2013 at 07:37:36PM +0800, Yan, Zheng wrote: From: Yan, Zheng zheng.z@intel.com dentry_lru_prune() should always call file system's d_prune callback. Why? What

Re: [PATCH 2/2] fs: fix dentry_lru_prune()

2013-03-07 Thread Yan, Zheng
On 03/08/2013 02:27 PM, Dave Chinner wrote: On Fri, Mar 08, 2013 at 10:43:00AM +0800, Yan, Zheng wrote: On 03/08/2013 10:04 AM, Dave Chinner wrote: On Thu, Mar 07, 2013 at 07:37:36PM +0800, Yan, Zheng wrote: From: Yan, Zheng zheng.z@intel.com dentry_lru_prune() should always call file