Re: [Gluster-devel] Problem with TLA ver 887

2009-02-10 Thread Anand Avati
Mickey,
 Can you check if the latest tla code has resolved the issues you faced?

Thanks,
Avati

On Sun, Feb 8, 2009 at 11:19 PM, Mickey Mazarick m...@digitaltadpole.com 
wrote:
 Heh our tests are kind of an unholy mess... but here's the part I think is
 useful:
 We use a startup script that will iterate through vol files and mount the
 first available file on the list. We have a bunch of vol files that test a
 few different server configurations. After mountpoints are prepared we have
 other scripts that start virtual machines on the various mounts.

  In other words I have a directory called "/glustermounts/" and in that
 directory I have the files:
 main.vol  main.vol.ib  main.vol.tcp  stripe.vol.ha  stripe.vol.tcp

 after running "/etc/init.d/glustersystem start" I will have the following
 mount points:
 /system (our default mount, we actually store the vol files here)
 /mnt/main
 /mnt/stripe

 The output shows me if any vol file failed to mount and it automatically
 attempts the next one (e.g. "mounting main.vol failed, trying main.vol.ib").
 We simply arrange vol files from most features to least. We have a separate
 script which starts up a virtual machine on each test mount. This is the
 actual "test" we use, as it creates symbolic links, uses mmap, etc., but it's
 pretty specific to us. This closely mirrors how we use it in production.

 I've included our startup script and I would suggest you simply run
 something similar to your production workload on a few mounts in the same
 way we have. I may share this with the entire group although there are
 probably better init scripts out there. This one does kill all processes
 attached to a mount point, which is useful. Let me know if you have any
 questions!

 Thanks!

 -Mickey Mazarick



 Geoff Kassel wrote:

 Hi,
As a fellow GlusterFS user, I was just wondering if you could point me to
 the regression tests you're using for GlusterFS?

   I've looked high and low for the unit tests that the GlusterFS devs are
 meant to be using (à la http://www.gluster.org/docs/index.php/GlusterFS_QA)
 so that I can do my own testing, but I've not been able to find them.

If it's tests you've developed in-house, would you be interested in
 releasing them to the wider community?

 Kind regards,

 Geoff Kassel.

 On Thu, 5 Feb 2009, Mickey Mazarick wrote:


 I haven't done any full regression testing to see where the problem is
 but the later TLA versions are causing our storage servers to spike to
 100% CPU usage and the clients never see any files. Our initial tests
 are with ibverbs/HA but no performance translators.

 Thanks!
 -Mickey Mazarick


 --

 #!/bin/sh
 # Startup script for gluster Mount system
 volFiles="/glustermounts/"
 defaultcheckFile="customers"
 speclist="/etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.ha /etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.tcp"
 start() {
     specfile=${1}
     if [ "$#" -gt 1 ]; then
         mountpt=${2}
     else
         mountpt=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
         mountpt="/mnt/${mountpt}"
     fi
     logfile=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
     logfile="/var/${logfile}.log"
     pidfile=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
     pidfile="/var/run/${pidfile}.pid"
     echo "mounting specfile:${specfile} at:${mountpt} with pid at:${pidfile}"
     currentpids=`pidof glusterfs`
     currentpids="0 ${currentpids}"
     mountct=`mount | grep ${mountpt} | grep -c glusterfs`
     if [ -f $pidfile ]; then
         currentpid=`cat ${pidfile}`
         pidct=`echo "${currentpids}" | grep -c ${currentpid}`
         if [ "${pidct}" -eq 0 ]; then
             rm -rf ${pidfile}
             echo "removing pid file: ${pidfile}"
         fi
         if [ "${mountct}" -lt 1 ]; then
             echo "Gluster System mount:${mountpt} died. Remounting."
             stop ${mountpt} ${pidfile}
         fi
     else
         rm -rf ${pidfile}
         if [ "${mountct}" -gt 0 ]; then
             myupid=`ps -ef | grep /system | grep gluster | sed "s#root\s*##" | sed "s#\s.*##"`
             if [ "${myupid}" -gt 0 ]; then
                 echo "${myupid}" > ${pidfile}
             else
                 echo "Gluster System mounted at:${mountpt} but with no pid. Remounting."
                 stop ${mountpt} ${pidfile}
             fi
         fi
     fi

     if [ -e $pidfile ]; then
         echo "Gluster System Mount:${mountpt} is running with spec: ${specfile}"
         #echo "Gluster System Mount:${mountpt} is running."
         return 0
     else
         #rm -rf /var/glustersystemclient.log
         modprobe fuse
         sleep 1.5
         #rm -rf /var/glustersystemclient.log
         mkdir ${mountpt}
         rm -rf $pidfile

Re: [Gluster-devel] Problem with TLA ver 887

2009-02-10 Thread Mickey Mazarick

We did a tla install of 906 yesterday and the problem seems to have
been resolved by that build.

Thanks and keep up the great work!
-Mic

Anand Avati wrote:

  Mickey,
 Can you check if the latest tla code has resolved the issues you faced?

Thanks,
Avati

On Sun, Feb 8, 2009 at 11:19 PM, Mickey Mazarick m...@digitaltadpole.com wrote:
Heh our tests are kind of an unholy mess... but here's the part I think is
useful:
We use a startup script that will iterate through vol files and mount the
first available file on the list. We have a bunch of vol files that test a
few different server configurations. After mountpoints are prepared we have
other scripts that start virtual machines on the various mounts.

 In other words I have a directory called "/glustermounts/" and in that
directory I have the files:
main.vol  main.vol.ib  main.vol.tcp  stripe.vol.ha  stripe.vol.tcp

after running "/etc/init.d/glustersystem start" I will have the following
mount points:
/system (our default mount, we actually store the vol files here)
/mnt/main
/mnt/stripe

The output shows me if any vol file failed to mount and it automatically
attempts the next one (e.g. "mounting main.vol failed, trying main.vol.ib").
We simply arrange vol files from most features to least. We have a separate
script which starts up a virtual machine on each test mount. This is the
actual "test" we use, as it creates symbolic links, uses mmap, etc., but it's
pretty specific to us. This closely mirrors how we use it in production.

I've included our startup script and I would suggest you simply run
something similar to your production workload on a few mounts in the same
way we have. I may share this with the entire group although there are
probably better init scripts out there. This one does kill all processes
attached to a mount point, which is useful. Let me know if you have any
questions!

Thanks!

-Mickey Mazarick



Geoff Kassel wrote:

Hi,
   As a fellow GlusterFS user, I was just wondering if you could point me to
the regression tests you're using for GlusterFS?

   I've looked high and low for the unit tests that the GlusterFS devs are
meant to be using (à la http://www.gluster.org/docs/index.php/GlusterFS_QA)
so that I can do my own testing, but I've not been able to find them.

   If it's tests you've developed in-house, would you be interested in
releasing them to the wider community?

Kind regards,

Geoff Kassel.

On Thu, 5 Feb 2009, Mickey Mazarick wrote:


I haven't done any full regression testing to see where the problem is
but the later TLA versions are causing our storage servers to spike to
100% CPU usage and the clients never see any files. Our initial tests
are with ibverbs/HA but no performance translators.

Thanks!
-Mickey Mazarick


--

#!/bin/sh
# Startup script for gluster Mount system
volFiles="/glustermounts/"
defaultcheckFile="customers"
speclist="/etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.ha /etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.tcp"
start() {
    specfile=${1}
    if [ "$#" -gt 1 ]; then
        mountpt=${2}
    else
        mountpt=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
        mountpt="/mnt/${mountpt}"
    fi
    logfile=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
    logfile="/var/${logfile}.log"
    pidfile=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
    pidfile="/var/run/${pidfile}.pid"
    echo "mounting specfile:${specfile} at:${mountpt} with pid at:${pidfile}"
    currentpids=`pidof glusterfs`
    currentpids="0 ${currentpids}"
    mountct=`mount | grep ${mountpt} | grep -c glusterfs`
    if [ -f $pidfile ]; then
        currentpid=`cat ${pidfile}`
        pidct=`echo "${currentpids}" | grep -c ${currentpid}`
        if [ "${pidct}" -eq 0 ]; then
            rm -rf ${pidfile}
            echo "removing pid file: ${pidfile}"
        fi
        if [ "${mountct}" -lt 1 ]; then
            echo "Gluster System mount:${mountpt} died. Remounting."
            stop ${mountpt} ${pidfile}
        fi
    else
        rm -rf ${pidfile}
        if [ "${mountct}" -gt 0 ]; then
            myupid=`ps -ef | grep /system | grep gluster | sed "s#root\s*##" | sed "s#\s.*##"`
            if [ "${myupid}" -gt 0 ]; then
                echo "${myupid}" > ${pidfile}
            else
                echo "Gluster System mounted at:${mountpt} but with no pid. Remounting."
                stop ${mountpt} ${pidfile}
            fi
        fi
    fi

    if [ -e $pidfile ]; then
        echo "Gluster System Mount:${mountpt} is running with spec: ${specfile}"
        #echo "Gluster System Mount:${mountpt} is running."
        return 0
    else
        #rm -rf /var/glustersystemclient.log

Re: [Gluster-devel] Problem with TLA ver 887

2009-02-08 Thread Mickey Mazarick

Heh our tests are kind of an unholy mess... but here's the part I think
is useful:
We use a startup script that will iterate through vol files and mount
the first available file on the list. We have a bunch of vol files that
test a few different server configurations. After mountpoints are
prepared we have other scripts that start virtual machines on the
various mounts.

In other words I have a directory called "/glustermounts/" and in that
directory I have the files:
main.vol main.vol.ib main.vol.tcp stripe.vol.ha stripe.vol.tcp

after running "/etc/init.d/glustersystem start" I will have the
following mount points: 
/system (our default mount, we actually store the vol files here)
/mnt/main
/mnt/stripe
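
For reference, confirming which of these actually came up needs nothing
Gluster-specific; the ordinary mount table shows it:

# list the active glusterfs mounts
mount | grep glusterfs
# or check that a particular mount point is live
df -h /mnt/main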

The output shows me if any vol file failed to mount and it
automatically attempts the next one (e.g. "mounting main.vol failed,
trying main.vol.ib"). We simply arrange vol files from most features to
least. We have a separate script which starts up a virtual machine on
each test mount. This is the actual "test" we use, as it creates
symbolic links, uses mmap, etc., but it's pretty specific to us. This
closely mirrors how we use it in production.
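
The fallback logic is simple enough to sketch standalone. A minimal
version, with illustrative paths and a hard-coded mount point instead
of the derived ones the real script uses:

#!/bin/sh
# Try vol files in order, most features first, and stop at the first
# one that actually mounts.
mountpt=/mnt/main
for spec in /glustermounts/main.vol /glustermounts/main.vol.ib \
            /glustermounts/main.vol.tcp; do
    echo "trying ${spec}"
    /usr/local/sbin/glusterfs -f "${spec}" "${mountpt}"
    sleep 1
    # the client daemonizes, so check the mount table, not the exit code
    if mount | grep "${mountpt}" | grep -q glusterfs; then
        echo "mounted ${mountpt} from ${spec}"
        break
    fi
    echo "mounting ${spec} failed, trying the next vol file"
done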

I've included our startup script and I would suggest you simply run
something similar to your production workload on a few mounts in the
same way we have. I may share this with the entire group although there
are probably better init scripts out there. This one does kill all
processes attached to a mount point, which is useful. Let me know if
you have any questions!
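
The kill-everything-holding-a-mountpoint step needs nothing exotic
either; a one-liner sketch, assuming psmisc's fuser is installed:

# kill all processes using the mount point (fuser -k defaults to
# SIGKILL), then unmount it
fuser -km /mnt/main
umount /mnt/main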

Thanks!

-Mickey Mazarick



Geoff Kassel wrote:

  Hi,
   As a fellow GlusterFS user, I was just wondering if you could point me to 
the regression tests you're using for GlusterFS?

   I've looked high and low for the unit tests that the GlusterFS devs are
meant to be using (à la http://www.gluster.org/docs/index.php/GlusterFS_QA)
so that I can do my own testing, but I've not been able to find them.

   If it's tests you've developed in-house, would you be interested in 
releasing them to the wider community?

Kind regards,

Geoff Kassel.

On Thu, 5 Feb 2009, Mickey Mazarick wrote:
I haven't done any full regression testing to see where the problem is
but the later TLA versions are causing our storage servers to spike to
100% CPU usage and the clients never see any files. Our initial tests
are with ibverbs/HA but no performance translators.

Thanks!
-Mickey Mazarick




-- 



#!/bin/sh
# Startup script for gluster Mount system
volFiles="/glustermounts/"
defaultcheckFile="customers"
# vol files to try, in order of preference (most features first)
speclist="/etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.ha /etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.tcp"
start() {
    specfile=${1}
    if [ "$#" -gt 1 ]; then
        mountpt=${2}
    else
        # derive the mount point name from the spec file name
        mountpt=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
        mountpt="/mnt/${mountpt}"
    fi
    logfile=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
    logfile="/var/${logfile}.log"
    pidfile=`echo ${specfile} | sed "s#\.vol.*\\\$##" | sed "s#/.*/##"`
    pidfile="/var/run/${pidfile}.pid"
    echo "mounting specfile:${specfile} at:${mountpt} with pid at:${pidfile}"
    currentpids=`pidof glusterfs`
    currentpids="0 ${currentpids}"
    mountct=`mount | grep ${mountpt} | grep -c glusterfs`
    if [ -f $pidfile ]; then
        currentpid=`cat ${pidfile}`
        pidct=`echo "${currentpids}" | grep -c ${currentpid}`
        # stale pid file: the recorded pid is no longer running
        if [ "${pidct}" -eq 0 ]; then
            rm -rf ${pidfile}
            echo "removing pid file: ${pidfile}"
        fi
        if [ "${mountct}" -lt 1 ]; then
            echo "Gluster System mount:${mountpt} died. Remounting."
            stop ${mountpt} ${pidfile}
        fi
    else
        rm -rf ${pidfile}
        if [ "${mountct}" -gt 0 ]; then
            # mounted but no pid file: try to recover the pid from ps
            myupid=`ps -ef | grep /system | grep gluster | sed "s#root\s*##" | sed "s#\s.*##"`
            if [ "${myupid}" -gt 0 ]; then
                echo "${myupid}" > ${pidfile}
            else
                echo "Gluster System mounted at:${mountpt} but with no pid. Remounting."
                stop ${mountpt} ${pidfile}
            fi
        fi
    fi

    if [ -e $pidfile ]; then
        echo "Gluster System Mount:${mountpt} is running with spec: ${specfile}"
        #echo "Gluster System Mount:${mountpt} is running."
        return 0
    else
        # not running yet: load fuse and mount this spec file
        #rm -rf /var/glustersystemclient.log
        modprobe fuse
        sleep 1.5
        #rm -rf /var/glustersystemclient.log
        mkdir ${mountpt}
        rm -rf $pidfile
        cmd="/usr/local/sbin/glusterfs -p $pidfile -l ${logfile} -L ERROR -f ${specfile} --disable-direct-io-mode ${mountpt}"
        echo ${cmd}
        ${cmd}
        #/usr/local/sbin/glusterfs -p $pidfile -l ${logfile}

Re: [Gluster-devel] Problem with TLA ver 887

2009-02-08 Thread Geoff Kassel
Hi Mickey,
   Thanks for this.

Cheers,

Geoff Kassel.

On Mon, 9 Feb 2009, Mickey Mazarick wrote:
 Heh our tests are kind of an unholy mess... but here's the part I think
 is useful:
 We use a startup script that will iterate through vol files and mount
 the first available file on the list. We have a bunch of vol files that
 test a few different server configurations. After mountpoints are
 prepared we have other scripts that start virtual machines on the
 various mounts.

  In other words I have a directory called "/glustermounts/" and in that
 directory I have the files:
 main.vol  main.vol.ib  main.vol.tcp  stripe.vol.ha  stripe.vol.tcp

 after running "/etc/init.d/glustersystem start" I will have the
 following mount points:
 /system (our default mount, we actually store the vol files here)
 /mnt/main
 /mnt/stripe

 The output shows me if any vol file failed to mount and it automatically
 attempts the next one (e.g. "mounting main.vol failed, trying
 main.vol.ib"). We simply arrange vol files from most features to least.
 We have a separate script which starts up a virtual machine on each test
 mount. This is the actual "test" we use, as it creates symbolic links,
 uses mmap, etc., but it's pretty specific to us. This closely mirrors
 how we use it in production.

 I've included our startup script and I would suggest you simply run
 something similar to your production workload on a few mounts in the
 same way we have. I may share this with the entire group although there
 are probably better init scripts out there. This one does kill all
 processes attached to a mount point, which is useful. Let me know if you
 have any questions!

 Thanks!

 -Mickey Mazarick

 Geoff Kassel wrote:
  Hi,
 As a fellow GlusterFS user, I was just wondering if you could point me
  to the regression tests you're using for GlusterFS?
 
 I've looked high and low for the unit tests that the GlusterFS devs
  are meant to be using (à la
  http://www.gluster.org/docs/index.php/GlusterFS_QA) so that I can do my
  own testing, but I've not been able to find them.
 
 If it's tests you've developed in-house, would you be interested in
  releasing them to the wider community?
 
  Kind regards,
 
  Geoff Kassel.
 
  On Thu, 5 Feb 2009, Mickey Mazarick wrote:
  I haven't done any full regression testing to see where the problem is
  but the later TLA versions are causing our storage servers to spike to
  100% CPU usage and the clients never see any files. Our initial tests
  are with ibverbs/HA but no performance translators.
 
  Thanks!
  -Mickey Mazarick




Re: [Gluster-devel] Problem with TLA ver 887

2009-02-04 Thread Amar Tumballi (bulde)
Hi Mickey,
 Can you just attach gdb to the server process and get a backtrace ('bt')?
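
A minimal way to do that, assuming the server process is named
glusterfsd (the name may differ on your install) and gdb is available:

# attach gdb to the running server; if pidof returns several pids,
# pick the one pegging the CPU
gdb -p `pidof glusterfsd`
(gdb) thread apply all bt
(gdb) detach
(gdb) quit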

Regards,
Amar

2009/2/4 Mickey Mazarick m...@digitaltadpole.com

 I haven't done any full regression testing to see where the problem is but
 the later TLA versions are causing our storage servers to spike to 100% CPU
 usage and the clients never see any files. Our initial tests are with
 ibverbs/HA but no performance translators.

 Thanks!
 -Mickey Mazarick

-- 
Amar Tumballi
Gluster/GlusterFS Hacker
[bulde on #gluster/irc.gnu.org]
http://www.zresearch.com - Commoditizing Super Storage!


Re: [Gluster-devel] Problem with TLA ver 887

2009-02-04 Thread Vikas Gorur
2009/2/5 Mickey Mazarick m...@digitaltadpole.com:
 I haven't done any full regression testing to see where the problem is but
 the later TLA versions are causing our storage servers to spike to 100% CPU
 usage and the clients never see any files. Our initial tests are with
 ibverbs/HA but no performance translators.

Thanks for reporting. I found a bug that was introduced into
features/locks which would have caused a deadlock and thus 100% CPU
usage. It has been fixed in patch-892. Can you please verify that the
bug is no longer present?
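
If it helps, a quick way to confirm which patch level a tla checkout is
at is to list its patch log; the last entry is the current level (e.g.
"patch-892"):

# run from inside the glusterfs source tree
tla logs | tail -1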

Vikas
-- 
Engineer - Z Research
http://gluster.com/

