On 08/11/2012 10:16 PM, Harry Mangalam wrote:
On Sat, Aug 11, 2012 at 9:41 AM, Brian Candler b.cand...@pobox.com wrote:
Maybe worth trying an strace (strace -f -p <pid> 2>strace.out) on the
glusterfsd process, or whatever it is that's causing the high load,
during such a burst.
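Something along these lines (a sketch; pgrep -o just grabs the oldest
glusterfsd if there happen to be several):

    # attach to the running brick daemon, follow its threads/forks,
    # and send the syscall trace (which goes to stderr) to a file
    strace -f -p $(pgrep -o glusterfsd) 2> /tmp/strace.out

Let it run for a few seconds during a burst, then interrupt it and look
for syscalls repeating in a tight loop.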
On Sat, Aug 11, 2012 at 12:11:39PM +0100, Nux! wrote:
On 10.08.2012 22:16, Harry Mangalam wrote:
pbs3:/dev/md127    8.2T  5.9T  2.3T  73%  /bducgl
Harry,
The name of that md device (127) indicates there may be something
dodgy going on there. A device shouldn't be named 127 unless some
Thanks for your comments.
I use mdadm on many servers and I've seen md numbering like this a fair
bit. Usually it occurs after another RAID has been created and the
numbering shifts. Neil Brown (mdadm's author) seems to think it's fine.
So I don't think that's the problem. And you're right
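For what it's worth, the number is just a label; the state of the array
itself is easy to confirm directly, e.g.:

    # show all assembled arrays and their sync status
    cat /proc/mdstat
    # detailed health of the brick's array
    mdadm --detail /dev/md127

mdadm --detail would report a degraded or rebuilding state if the
device were actually unhealthy.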
On Sat, Aug 11, 2012 at 08:31:51AM -0700, Harry Mangalam wrote:
Re the size difference, I'll explicitly rebalance the brick after the
fix-layout finishes, but I'm even more worried about this fantastic
increase in CPU usage and its effect on user performance.
This presumably means you
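For reference, the fix-layout-then-migrate sequence Harry describes maps
to roughly these commands in 3.3 (VOLNAME is a placeholder):

    # phase 1: recompute directory layouts only, moves no data
    gluster volume rebalance VOLNAME fix-layout start
    # phase 2: migrate files to match the new layout
    gluster volume rebalance VOLNAME start
    # check per-node progress
    gluster volume rebalance VOLNAME status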
On Sat, Aug 11, 2012 at 9:41 AM, Brian Candler b.cand...@pobox.com wrote:
Check your client logs. I have seen that with network issues causing
disconnects.
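A quick way to check, assuming the default log location on a client
(the mount log is named after the mount point):

    # look for disconnect/reconnect messages in the client logs
    grep -i disconnect /var/log/glusterfs/*.log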
Harry Mangalam hjmanga...@gmail.com wrote:
running 3.3 distributed on IPoIB on 4 nodes, 1 brick per node. Any idea
why, on one of those nodes, glusterfsd would go berserk, running up to 370%
CPU and driving the load to 30 (file performance on the clients slows to a
crawl)? While very slow, it continued to serve out files. This is the
second
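Next time it spins, a per-thread view may narrow down what to strace
(assuming a single glusterfsd on the brick node):

    # show which glusterfsd threads are burning CPU
    top -H -p $(pgrep -o glusterfsd)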