Hi Markus,

This is the first time I have come across this particular backtrace/crash. I am looking into it now and have filed a bug at http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2293

In the meantime, can you try the steps below and see whether they fix the problem:

* stop all gluster processes (glusterd/glusterfs/glusterfsd)
* move the glusterd config directory out of the way (on both machines):
  bash# mv /etc/glusterd /etc/glusterd.old
* start glusterd on both machines, then run gluster peer probe again

Let me know the output.

Regards,
Amar

2011/1/14 Markus Fröhlich <markus.froehl...@xidras.com>

> I have two servers with SLES11 SP1 x86_64 and compiled last version of
> glusterfs 3.1.1.
> firewall is disabled on both nodes and they are on the same network.
>
> I put both hostnames in the hosts file, so that each node can resolv the
> others hostname correctly
> 192.168.8.104 virt-zabbix-02
> 192.168.8.105 virt-zabbix-03
>
> this is my config on both nodes: "/etc/glusterfs/glusterd.vol"
> volume management
> type mgmt/glusterd
> option working-directory /etc/glusterd
> option transport-type socket,rdma
> option transport.socket.keepalive-time 10
> option transport.socket.keepalive-interval 2
> end-volume
>
> virt-zabbix-02# gluster peer status
> No peers present
>
> log:
> [2011-01-13 19:53:31.576554] I
> [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
> cli list req
>
> this is okay, but then, when I want to add the other node to the cluster,
> the "glusterfsd" dies on "virt-zabbix-02" where I type the command and a
> core-dump file is generated:
> virt-zabbix-02# gluster peer probe virt-zabbix-03
>
> log virt-zabbix-02:
> [2011-01-13 19:54:29.284735] I
> [glusterd-handler.c:563:glusterd_handle_cli_probe] glusterd: Received CLI
> probe req virt-zabbix-03 24007
> [2011-01-13 19:54:29.285110] I
> [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to find
> hostname: virt-zabbix-03
> [2011-01-13 19:54:29.285136] I
> [glusterd-handler.c:2618:glusterd_probe_begin] glusterd: Unable to find
> peerinfo for host: virt-zabbix-03 (24007)
> [2011-01-13 19:54:29.287625] W [rpc-transport.c:849:rpc_transport_load]
> rpc-transport: missing 'option transport-type'.
> defaulting to "socket"
> [2011-01-13 19:54:29.288496] I
> [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect returned 0
> [2011-01-13 19:54:29.293369] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> virt-zabbix-03 found.. state: 0
> [2011-01-13 19:54:29.302062] I
> [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received probe resp
> from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a, host: virt-zabbix-03
> [2011-01-13 19:54:29.302097] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:54:29.302111] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> virt-zabbix-03 found.. state: 0
> pending frames:
>
> patchset: v3.1.1
> signal received: 11
> time of crash: 2011-01-13 19:54:29
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.1.1
> /lib64/libc.so.6(+0x329e0)[0x7f1cbbb589e0]
> /usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7f1cbc4c506c]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7f1cbc4ca878]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7f1cba4203be]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7f1cba424f3b]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7f1cba40db17]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7f1cba40d675]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7f1cba4281f5]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7f1cbc4c9a94]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7f1cbc4c9cd8]
> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7f1cbc4c4f2e]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7f1cba1def9f]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7f1cba1df0d4]
> /usr/lib64/libglusterfs.so.0(+0x3a384)[0x7f1cbc70b384]
> /usr/sbin/glusterd(main+0x23c)[0x4055dc]
> /lib64/libc.so.6(__libc_start_main+0xe6)[0x7f1cbbb44bc6]
> /usr/sbin/glusterd[0x4032c9]
> ---------
>
> log virt-zabbix-03:
> [2011-01-13 19:54:29.296723] I
> [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: Received
> probe from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941
> [2011-01-13 19:54:29.296802] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:54:29.297224] I
> [glusterd-handler.c:398:glusterd_friend_find] glusterd: Unable to find
> hostname: 192.168.8.104
> [2011-01-13 19:54:29.297278] I
> [glusterd-handler.c:2401:glusterd_handle_probe_query] glusterd: Unable to
> find peerinfo for host: 192.168.8.104 (24007)
> [2011-01-13 19:54:29.300119] W [rpc-transport.c:849:rpc_transport_load]
> rpc-transport: missing 'option transport-type'. defaulting to "socket"
> [2011-01-13 19:54:29.304856] I
> [glusterd-handler.c:2600:glusterd_friend_add] glusterd: connect returned 0
> [2011-01-13 19:54:29.304994] I
> [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: Responded to
> virt-zabbix-03, op_ret: 0, op_errno: 0, ret: 0
> [2011-01-13 19:54:35.314773] E [socket.c:1656:socket_connect_finish]
> management: connection to 192.168.8.104:24007 failed (Connection refused)
>
> so I start the "gluserfsd" on virt-zabbix-02 again - a few secounds later
> the glusterfsd dies on the other node virt-zabbix-03 and there also a
> core-dump file is generated
>
> log virt-zabbix-02:
> [2011-01-13 19:57:08.911495] I
> [glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: Received
> probe from uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
> [2011-01-13 19:57:08.911559] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:57:08.911643] I
> [glusterd-utils.c:2140:glusterd_friend_find_by_hostname] glusterd: Friend
> 192.168.8.105 found.. state: 0
> [2011-01-13 19:57:08.911715] I
> [glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: Responded to
> 192.168.8.104, op_ret: 0, op_errno: 0, ret: 0
> [2011-01-13 19:57:11.956152] E [socket.c:1656:socket_connect_finish]
> management: connection to 192.168.8.105:24007 failed (Connection refused)
>
> log virt-zabbix-03:
> [2011-01-13 19:57:08.913897] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> 192.168.8.104 found.. state: 0
> [2011-01-13 19:57:08.915052] I
> [glusterd3_1-mops.c:80:glusterd3_1_probe_cbk] glusterd: Received probe resp
> from uuid: a9b660c5-456d-4e96-9bdd-d23c917ae941, host: 192.168.8.104
> [2011-01-13 19:57:08.915085] I
> [glusterd-handler.c:386:glusterd_friend_find] glusterd: Unable to find peer
> by uuid
> [2011-01-13 19:57:08.915100] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> 192.168.8.104 found.. state: 0
> pending frames:
>
> patchset: v3.1.1
> signal received: 11
> time of crash: 2011-01-13 19:57:08
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.1.1
> /lib64/libc.so.6(+0x329e0)[0x7fe84e6ee9e0]
> /usr/lib64/libgfrpc.so.0(rpc_transport_connect+0xc)[0x7fe84f05b06c]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit+0x3d8)[0x7fe84f060878]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_submit_request+0x15e)[0x7fe84cfb63be]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_friend_add+0x11b)[0x7fe84cfbaf3b]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(+0x27b17)[0x7fe84cfa3b17]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd_friend_sm+0x175)[0x7fe84cfa3675]
> /usr/lib64/glusterfs/3.1.1/xlator/mgmt/glusterd.so(glusterd3_1_probe_cbk+0x495)[0x7fe84cfbe1f5]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7fe84f05fa94]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xc8)[0x7fe84f05fcd8]
> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x2e)[0x7fe84f05af2e]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7fe84cd74f9f]
> /usr/lib64/glusterfs/3.1.1/rpc-transport/socket.so(socket_event_handler+0x114)[0x7fe84cd750d4]
> /usr/lib64/libglusterfs.so.0(+0x3a384)[0x7fe84f2a1384]
> /usr/sbin/glusterd(main+0x23c)[0x4055dc]
> /lib64/libc.so.6(__libc_start_main+0xe6)[0x7fe84e6dabc6]
> /usr/sbin/glusterd[0x4032c9]
> ---------
>
> starting the glusterfsd on virt-zabbix-03 again, let die the glusterfsd on
> virt-zabbix-02 and so on
> so I make sure the daemon is stopped on both hosts.
> the peer file generated on the nodes are different one is named with the
> hostname, the other with the IP:
> virt-zabbix-02:# cat /etc/glusterd/peers/virt-zabbix-03
> uuid=
> state=0
> hostname1=virt-zabbix-03
>
> virt-zabbix-03:# cat /etc/glusterd/peers/192.168.8.104
> uuid=
> state=0
> hostname1=192.168.8.104
>
> so I see the uuid is empty in both files and I fill it with the uuid from
> each others "/etc/glusterd/glusterd.info" file:
> virt-zabbix-02:/ # cat /etc/glusterd/glusterd.info
> UUID=a9b660c5-456d-4e96-9bdd-d23c917ae941
> virt-zabbix-03:/ # cat etc/glusterd/glusterd.info
> UUID=255540da-4b86-46f2-963c-3214e2c5e28a
>
> virt-zabbix-02:/ # cat /etc/glusterd/peers/virt-zabbix-03
> uuid=255540da-4b86-46f2-963c-3214e2c5e28a
> state=0
> hostname1=virt-zabbix-03
>
> virt-zabbix-03:/ # cat /etc/glusterd/peers/192.168.8.104
> uuid=a9b660c5-456d-4e96-9bdd-d23c917ae941
> state=0
> hostname1=192.168.8.104
>
> now I start "glusterfsd" on both nodes again and both daemons keep running
> and I can type the command:
> virt-zabbix-02:/ # gluster peer status
> Number of Peers: 1
>
> Hostname: virt-zabbix-03
> Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
> State: Establishing Connection (Connected)
>
> I'd like to create my first test volume:
> gluster volume create mytest transport tcp virt-zabbix-02:/gfs1
> virt-zabbix-03:/gfs1
> Creation of volume mytest has been unsuccessful
> Host virt-zabbix-03 not connected
>
> log virt-zabbix-02:
> [2011-01-13 20:11:10.706931] I
> [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
> cli list req
> [2011-01-13 20:12:20.950199] I
> [glusterd-handler.c:785:glusterd_handle_create_volume] glusterd: Received
> create volume req
> [2011-01-13 20:12:20.950907] I
> [glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
> virt-zabbix-03 found.. state: 0
> [2011-01-13 20:12:20.950935] I
> [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend
> found.. state: Establishing Connection
> [2011-01-13 20:12:20.950950] E
> [glusterd-utils.c:2324:glusterd_new_brick_validate] glusterd: Host
> virt-zabbix-03 not connected
> [2011-01-13 20:12:20.951005] E
> [glusterd-handler.c:906:glusterd_handle_create_volume] glusterd: Unlock on
> opinfo failed
>
> no logfiles on virt-zabbix-03
>
> not connected? strange! status info again:
> virt-zabbix-02:/ # gluster peer status
> Number of Peers: 1
>
> Hostname: virt-zabbix-03
> Uuid: 255540da-4b86-46f2-963c-3214e2c5e28a
> State: Establishing Connection (Connected)
>
> log virt-zabbix-02:
> [2011-01-13 20:13:24.601901] I
> [glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
> cli list req
>
> so I restart the glusterfsd on virt-zabbix-03 and the daemon on
> virt-zabbix-02 dies again
>
> has some one any idea whats going wrong?
>
> kind regards
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
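
For reference, the directory move from the steps at the top of this reply could be wrapped in a small helper like the sketch below. The function name backup_glusterd_dir and its two safety checks are illustrative assumptions, not part of GlusterFS; only the mv of /etc/glusterd to /etc/glusterd.old comes from the steps themselves. Stopping and starting the daemons is left as comments, since how that is done depends on how glusterfs was installed.

```shell
#!/bin/sh
# Sketch only: move a glusterd working directory aside so that glusterd
# starts again with fresh state. backup_glusterd_dir is a hypothetical
# helper name; the checks around the mv are assumptions added for safety.

# backup_glusterd_dir DIR
# Renames DIR to DIR.old. Fails if DIR is missing or DIR.old already exists.
backup_glusterd_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "nothing to move: $dir does not exist" >&2
        return 1
    fi
    if [ -e "$dir.old" ]; then
        echo "refusing to overwrite existing $dir.old" >&2
        return 1
    fi
    mv "$dir" "$dir.old"
}

# Intended use, as root, on BOTH machines, with all gluster processes
# (glusterd/glusterfs/glusterfsd) already stopped:
#   backup_glusterd_dir /etc/glusterd
# Then start glusterd on both machines and, from one of them only:
#   gluster peer probe virt-zabbix-03
```

Keeping the old directory as /etc/glusterd.old rather than deleting it means the previous peer files stay available for the bug report.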