Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-31 Thread Joel Young
OK folks, thanks for helping out. I kicked off my users and forced a reboot, and it looks like it came back up fine.

On Wed, Jul 31, 2013 at 11:45 AM, Joe Julian wrote:
> On 07/31/2013 11:42 AM, Joel Young wrote:
>> On Wed, Jul 31, 2013 at 10:29 AM, Joe Julian wrote:
>>> To kill a zombie

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-31 Thread Joe Julian
On 07/31/2013 11:42 AM, Joel Young wrote:
> On Wed, Jul 31, 2013 at 10:29 AM, Joe Julian wrote:
>> To kill a zombie process, you have to kill the parent process.
>>
>> ps -p 23744 -o ppid=
>>
>> If the result is 1, then you are stuck rebooting. Otherwise, kill that process.
>
> Thanks Joe. Unfortunately the par

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-31 Thread Joel Young
On Wed, Jul 31, 2013 at 10:29 AM, Joe Julian wrote:
> To kill a zombie process, you have to kill the parent process.
>
> ps -p 23744 -o ppid=
>
> If the result is 1, then you are stuck rebooting. Otherwise, kill that
> process.

Thanks Joe. Unfortunately the parent pid is indeed 1 and I'm not lik

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-31 Thread Joe Julian
To kill a zombie process, you have to kill the parent process.

ps -p 23744 -o ppid=

If the result is 1, then you are stuck rebooting. Otherwise, kill that process.

Deleting a filename does not close the named pipe, so that caused the failure below.

Joel Young wrote:
>On Tue, Jul 30, 2013 at 1
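
The check Joe describes can be wrapped in a small script. This is only a sketch of that advice, not anything shipped with Gluster: the function names `parent_of` and `advice_for_ppid` are made up for illustration, and the only external command used is the same `ps -p PID -o ppid=` call quoted in the message.

```shell
#!/bin/sh
# Print the parent PID of process $1 (empty output if there is no such process).
parent_of() {
    ps -p "$1" -o ppid= 2>/dev/null | tr -d ' '
}

# Map a parent PID to the action suggested in the thread.
# (Hypothetical helper; the wording of the messages is ours.)
advice_for_ppid() {
    case "$1" in
        "") echo "no such process" ;;
        1)  echo "parent is PID 1: reboot" ;;
        *)  echo "kill parent $1" ;;
    esac
}

# Example, using the zombie PID mentioned in the thread:
# advice_for_ppid "$(parent_of 23744)"
```

Keeping the decision separate from the `ps` call makes it easy to try the logic with a made-up parent PID before pointing it at a real process.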

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-31 Thread Joel Young
On Tue, Jul 30, 2013 at 10:49 PM, Kaushal M wrote:
> I think I've found the problem. The problem is not with the brick port, but
> instead with the unix domain socket used for communication between glusterd
> and glusterfsd.

Makes sense.

> So this is most likely due to the zombie process 23744 sti

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-30 Thread Kaushal M
I think I've found the problem. The problem is not with the brick port, but instead with the unix domain socket used for communication between glusterd and glusterfsd.

From the log you provided,
> [2013-07-29 23:34:41.949089] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfsd: Started running /

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-30 Thread Joel Young
Kaushal,

On Mon, Jul 29, 2013 at 11:59 PM, Kaushal M wrote:
> Some other process is listening on 49157 on ir2, but netstat and lsof
> don't provide any answers to what process it is.

I'm not confident that that is really happening. I tried hand executing the glusterfsd command with several othe

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-30 Thread Kaushal M
Some other process is listening on 49157 on ir2, but netstat and lsof don't provide any answers as to what process it is. Googling for more information, I came across a possible explanation for such processes: apparently, services which register with portmap are usually of this type. Can you do a 'rpcinfo -p

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-29 Thread Joel Young
On Mon, Jul 29, 2013 at 10:27 PM, Vijay Bellur wrote:
> Gluster NFS process should listen by default on 2049 with 3.4. Can you please check if port 2049 is in use?
> netstat -ntlp | grep 2049 can help.

[root@ir2 ~]# netstat -ntlp | grep 2049
tcp        0      0 0.0.0.0:2049        0.0.0.0:*        L

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-29 Thread Vijay Bellur
On 07/30/2013 10:08 AM, Joel Young wrote:
> On Mon, Jul 29, 2013 at 9:30 PM, Krishnan Parthasarathi <kpart...@redhat.com> wrote:
>> We need to find out what is the glusterfs process with pid 7798 and understand
>> what is causing glusterd, which is also a portmapping service for bricks/clien

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-29 Thread Joel Young
On Mon, Jul 29, 2013 at 9:30 PM, Krishnan Parthasarathi wrote:
> We need to find out what is the glusterfs process with pid 7798 and understand
> what is causing glusterd, which is also a portmapping service for bricks/clients,
> to assign a port that is already bound.

ps -ef | grep 7798
root 7

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-29 Thread Krishnan Parthasarathi
Joel,

I was hoping to see the pid and executable name of the process listening on the port, when you observe the "Address already in use" error message in the log file. The lsof output did help with that. We need to find out what is the glusterfs process with pid 7798 and understand what is causi

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-29 Thread Joel Young
Krishnan,

On Mon, Jul 29, 2013 at 8:24 PM, Krishnan Parthasarathi wrote:
> Could you check using netstat, what other process is listening on
> the port, around the time of failure?

[root@ir2 ~]# netstat -ntlp | grep 49157
tcp        0      0 0.0.0.0:49157       0.0.0.0:*        LISTEN      -

Don

Re: [Gluster-users] glusterfsd won't restart on one brick

2013-07-29 Thread Krishnan Parthasarathi
Joel,

From the logs, we see bind(3) is failing with "Address already in use". Could you check using netstat, what other process is listening on the port, around the time of failure?

# netstat -ntlp | grep <brick_port>

where brick_port can be found in the logs, see "--xlator-option home-server.listen-port=
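
A plain grep on the port number also matches longer numbers that happen to contain it (grepping for 2049 would match 20490, for example). The sketch below is one way to filter `netstat -ntlp` output for an exact port match. It assumes the usual Linux netstat column order (local address in column 4, state in column 6, PID/program in column 7), and the function name `listeners_on_port` is made up for illustration.

```shell
#!/bin/sh
# Read `netstat -ntlp` style lines on stdin and print the local address and
# PID/program column for sockets listening on exactly port $1.
# Assumes the usual Linux netstat layout:
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
listeners_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")       # port is the last ":"-separated field
            if (parts[n] == port) print $4, $7
        }'
}

# Example, using the brick port from the thread:
# netstat -ntlp 2>/dev/null | listeners_on_port 49157
```

A "-" in the last column, as in the output reported elsewhere in this thread, means netstat could not attribute the socket to a process.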

[Gluster-users] glusterfsd won't restart on one brick

2013-07-29 Thread Joel Young
On Fedora 19, the glusterfs-3.4.0 package just went out, replacing 3.4 beta4. After systemctl restart glusterd.service, one of the glusterfsd processes won't restart. In /var/log/glusterfs/bricks/lhome-gluster_home.log I see: [2013-07-29 23:34:41.949089] I [glusterfsd.c:1910:main] 0-/usr/sbin/glusterfsd: