[Gluster-users] Capturing config in a file

2014-03-31 Thread Steve Thomas
Hi,

Can anyone tell me how I can capture the Gluster brick config in a file? We're
running RHN Satellite, and I'd like to be able to push a config file out to any
new brick servers and also store the config for existing servers.

I'm running 3.4.2 and wondered which files, and where, would be necessary to
capture all of Gluster's configuration.

Thanks,
Steve
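
One possible starting point is sketched below. It assumes the default locations
of a 3.4-era install, /var/lib/glusterd (peer, volume and node state) and
/etc/glusterfs (glusterd.vol); neither path is confirmed in this thread, and
backup_gluster_config is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: archive glusterd's on-disk state for safekeeping.
# Assumed default locations (not confirmed in this thread):
#   /var/lib/glusterd  - peer, volume and node UUID state
#   /etc/glusterfs     - glusterd.vol and related config
backup_gluster_config() {
  local archive=$1
  shift
  # Fall back to the assumed default directories when none are given.
  if [ "$#" -eq 0 ]; then
    set -- /var/lib/glusterd /etc/glusterfs
  fi
  tar -czf "$archive" "$@"
}

# Usage (as root):
#   backup_gluster_config "/root/gluster-config-$(hostname).tar.gz"
```

Note that glusterd.info holds a UUID that is unique to each server, so an
archive like this is better suited to restoring the same server than to acting
as a template for new ones.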




___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster 3.4.2 on Redhat 6.5

2014-03-26 Thread Steve Thomas
Hi All,

I've got to the bottom of it. By running glusterd in the foreground with debug
enabled, I was able to see two error messages when the command was run. It
appears that it required the xfsprogs package, which I did not have installed.
Once I installed it, zombie processes appear to no longer be created.

Cheers,

Steve
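
A quick way to check for this kind of missing dependency is sketched below.
The tool names are assumptions: the thread only names the xfsprogs package
(which ships xfs_info), and missing_tools is a hypothetical helper.

```shell
#!/bin/bash
# Print each named command that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example (assumed tool list; xfs_info ships in xfsprogs):
#   missing_tools xfs_info tune2fs
# To watch glusterd's own errors, it can be run in the foreground:
#   service glusterd stop && glusterd --debug
```

An empty result means every listed tool was found.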

From: Carlos Capriotti [mailto:capriotti.car...@gmail.com]
Sent: 25 March 2014 12:30
To: Steve Thomas
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.4.2 on Redhat 6.5

Steve:

Tested that myself - not the nagios part, but the gluster commands you posted 
later - and no errors or zombies.

Somebody else reported the same, so, sounds consistent.

There must be another process there biting your gluster, turning it into a 
haunted scenario.

Cheers,

Carlos

On Thu, Mar 20, 2014 at 12:19 PM, Steve Thomas
<stho...@rpstechnologysolutions.co.uk> wrote:
Hi,

I'm running Gluster 3.4.2 on Red Hat 6.5 on 4 servers, with a brick on each.
This brick is mounted locally and used by Apache to serve audio files for an
IVR system. Each of these audio files is typically around 80-100 KB.

The system appears to be working OK in terms of health and status via the
gluster CLI.

The system is monitored by Nagios, and there's a check for zombie processes and
for the gluster status. Over a 24-hour period the number of zombie processes on
the box has increased, and it is still climbing. On investigation, these are
glusterd processes.

I'm making an assumption, but I suspect that the regular Nagios checks are
causing the increase in zombie processes, as they query the glusterd process.
The commands that the Nagios plugin runs are:

#Check heal status
gluster volume heal audio info

#Check volume status
gluster volume status audio detail

Does anyone have any suggestions as to why glusterd is leaving these zombie
processes behind?

Thanks for help in advance,

Steve
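
For reference, a zombie check of the kind described above can be reduced to a
threshold comparison that follows the usual Nagios plugin exit codes (0 OK,
1 WARNING, 2 CRITICAL). This is only a sketch: zombie_status is a hypothetical
helper and the thresholds are made up.

```shell
#!/bin/bash
# Map a zombie count to a Nagios-style status line and exit code.
# Thresholds are hypothetical; pick values that suit the host.
zombie_status() {
  local count=$1 warn=$2 crit=$3
  if [ "$count" -ge "$crit" ]; then
    echo "CRITICAL - $count zombie processes"
    return 2
  elif [ "$count" -ge "$warn" ]; then
    echo "WARNING - $count zombie processes"
    return 1
  fi
  echo "OK - $count zombie processes"
  return 0
}

# Usage:
#   zombie_status "$(ps -eo stat= | grep -c '^Z')" 5 20
```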




Re: [Gluster-users] Gluster 3.4.2 on Redhat 6.5

2014-03-24 Thread Steve Thomas
Hi Carlos,

Thanks for coming back to me. In response to your queries:

The PIDs are low: 1153 for glusterd, 1168 for glusterfsd, and 1318 and 1319 for
the 2 x glusterfs. So I'd agree: it doesn't seem that glusterd is crashing and
being restarted.

As of this morning (Monday), top is reporting 1398 glusterd zombie processes.

I have this problem on all 4 of my gluster nodes, and all four are being
monitored by the attached nagios plugin.

In terms of testing, I've stopped nagios from running the attached check
script and restarted the glusterd process using "service glusterd restart".
I've let it run for a few hours and haven't yet seen any zombie processes
created. This I think is good as, for whatever reason, it appears to point at
the nagios check script being the problem.

My next check was to run the nagios check once to see if it created a zombie
process... it did. So I started looking at the script. I forced the script to
exit after the first command, "gluster volume heal audio info", and no zombie
process was created. This pointed me to the second command, which takes the
form below. I'm no expert on HERE documents in shell, but I think that it may
be causing the issue:
while read -r line; do
  # Split the line into whitespace-separated fields
  field=($(echo $line))
  case ${field[0]} in
  Brick)
    brick=${field[@]:2}
    ;;
  Disk)
    key=${field[@]:0:3}
    if [ "${key}" = "Disk Space Free" ]; then
      freeunit=${field[@]:4}
      unit=${freeunit: -2}
      free=${freeunit%$unit}
      if [ "$unit" != "GB" ]; then
        Exit UNKNOWN "Unknown disk space size $freeunit\n"
      fi
      # Keep the smallest free-space figure seen so far
      if (( $(bc <<< "${free} < ${freegb}") == 1 )); then
        freegb=$free
      fi
    fi
    ;;
  Online)
    online=${field[@]:2}
    if [ "${online}" = "Y" ]; then
      let $((bricksfound++))
    else
      errors=("${errors[@]}" "$brick offline")
    fi
    ;;
  esac
done < <( sudo gluster volume status ${VOLUME} detail)


Anyone spot why this would be an issue?

Thanks,
Steve
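
For anyone wanting to test the parsing separately from the gluster command, a
cut-down version of such a loop is sketched below. It reads from stdin, so
canned output can be piped in. The "Online : Y" layout is an assumption based
on the field positions the script uses, and count_online_bricks is a
hypothetical name. Strictly speaking, a loop fed with "done < <(command)" uses
bash process substitution rather than a here document.

```shell
#!/bin/bash
# Count bricks reported as online in "gluster volume status VOL detail"
# style output read from stdin. The "Online : Y" layout is an assumption
# based on the field positions used in the plugin script.
count_online_bricks() {
  local count=0
  local -a field
  # read -a splits each line into whitespace-separated fields.
  while read -r -a field; do
    if [ "${field[0]}" = "Online" ] && [ "${field[2]}" = "Y" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Usage with process substitution, so the loop runs in the current shell:
#   count_online_bricks < <(sudo gluster volume status audio detail)
```

Feeding the function saved output makes it possible to rule out the parsing
loop without touching glusterd at all.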


From: Carlos Capriotti [mailto:capriotti.car...@gmail.com]
Sent: 22 March 2014 11:51
To: Steve Thomas
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.4.2 on Redhat 6.5

ok, let's see if we can gather more info.

I am not a specialist, but you know... another pair of eyes.

My system has a single glusterd process and it has a pretty low PID, meaning it 
has not crashed.

What is the PID of your glusterd? How many zombie processes does top report?

I've been running my preliminary tests with gluster for a little over a month 
now and have never seen this. My platform is CentOS 6.5, so, I'd say it is 
pretty similar.

From my perspective, even making gluster sweat, running some intense rsync 
jobs in parallel, and seeing glusterd AND glusterfs take 120% of processing 
time on top (each on one core), they never crashed.

My zombie count, from top, is zero.

On the other hand, I had one of my nodes, the other day, crashing a process
every time I started a demanding task. It turned out I had (and still have) a
hardware problem on one of the processors (or the main board; still
undiagnosed).

Do you have this problem on one node only?

Any chance you have something special compiled into your kernel?

Any particularly memory-hungry tweak in your sysctl?

Sounds like the system, not gluster.

KR,

Carlos



On Fri, Mar 21, 2014 at 10:29 PM, Steve Thomas
<stho...@rpstechnologysolutions.co.uk> wrote:
Hi all...

Further investigation shows in excess of 500 glusterd zombie processes on the
box, and the number is continuing to climb.

Any suggestions? I'm happy to provide logs etc. to get to the bottom of this.


Re: [Gluster-users] Gluster 3.4.2 on Redhat 6.5

2014-03-24 Thread Steve Thomas
Some further information:

When I run the command
gluster volume status audio detail
I get a zombie process created. So it's not the HERE document as I previously
thought; it's the command itself.

Does this happen with anyone else?

Thanks,
Steve
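
To reproduce this, the zombie count can be compared before and after a single
run of the command. The helper below is a sketch (count_zombies is a
hypothetical name); it reads "STAT COMMAND" pairs from stdin so that it can be
fed from ps.

```shell
#!/bin/bash
# Count defunct (zombie) processes whose command name matches $1.
# Reads "STAT COMMAND" pairs from stdin, e.g. from: ps -eo stat=,comm=
count_zombies() {
  awk -v name="$1" '$1 ~ /^Z/ && $2 == name { n++ } END { print n + 0 }'
}

# Usage:
#   before=$(ps -eo stat=,comm= | count_zombies glusterd)
#   gluster volume status audio detail >/dev/null
#   after=$(ps -eo stat=,comm= | count_zombies glusterd)
#   echo "new zombies: $((after - before))"
```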



[Gluster-users] Gluster 3.4.2 on Redhat 6.5

2014-03-21 Thread Steve Thomas
Hi,

I'm running Gluster 3.4.2 on Red Hat 6.5 on 4 servers, with a brick on each.
This brick is mounted locally and used by Apache to serve audio files for an
IVR system. Each of these audio files is typically around 80-100 KB.

The system appears to be working OK in terms of health and status via the
gluster CLI.

The system is monitored by Nagios, and there's a check for zombie processes and
for the gluster status. Over a 24-hour period the number of zombie processes on
the box has increased, and it is still climbing. On investigation, these are
glusterd processes.

I'm making an assumption, but I suspect that the regular Nagios checks are
causing the increase in zombie processes, as they query the glusterd process.
The commands that the Nagios plugin runs are:

#Check heal status
gluster volume heal audio info

#Check volume status
gluster volume status audio detail

Does anyone have any suggestions as to why glusterd is leaving these zombie
processes behind?

Thanks for help in advance,

Steve



Re: [Gluster-users] Gluster 3.4.2 on Redhat 6.5

2014-03-21 Thread Steve Thomas
Hi all...

Further investigation shows in excess of 500 glusterd zombie processes on the
box, and the number is continuing to climb.

Any suggestions? I'm happy to provide logs etc. to get to the bottom of this.

_
From: Steve Thomas
Sent: 21 March 2014 13:21
To: 'gluster-users@gluster.org'
Subject: Gluster 3.4.2 on Redhat 6.5


Hi,

I'm running Gluster 3.4.2 on Red Hat 6.5 on 4 servers, with a brick on each.
This brick is mounted locally and used by Apache to serve audio files for an
IVR system. Each of these audio files is typically around 80-100 KB.

The system appears to be working OK in terms of health and status via the
gluster CLI.

The system is monitored by Nagios, and there's a check for zombie processes and
for the gluster status. Over a 24-hour period the number of zombie processes on
the box has increased, and it is still climbing. On investigation, these are
glusterd processes.

I'm making an assumption, but I suspect that the regular Nagios checks are
causing the increase in zombie processes, as they query the glusterd process.
The commands that the Nagios plugin runs are:

#Check heal status
gluster volume heal audio info

#Check volume status
gluster volume status audio detail

Does anyone have any suggestions as to why glusterd is leaving these zombie
processes behind?

Thanks for help in advance,

Steve



[Gluster-users] Gluster 3.4.2 on Redhat 6.5

2014-03-20 Thread Steve Thomas
Hi,

I'm running Gluster 3.4.2 on Red Hat 6.5 on 4 servers, with a brick on each.
This brick is mounted locally and used by Apache to serve audio files for an
IVR system. Each of these audio files is typically around 80-100 KB.

The system appears to be working OK in terms of health and status via the
gluster CLI.

The system is monitored by Nagios, and there's a check for zombie processes and
for the gluster status. Over a 24-hour period the number of zombie processes on
the box has increased, and it is still climbing. On investigation, these are
glusterd processes.

I'm making an assumption, but I suspect that the regular Nagios checks are
causing the increase in zombie processes, as they query the glusterd process.
The commands that the Nagios plugin runs are:

#Check heal status
gluster volume heal audio info

#Check volume status
gluster volume status audio detail

Does anyone have any suggestions as to why glusterd is leaving these zombie
processes behind?

Thanks for help in advance,

Steve

