Hi,

Environment:
Gluster Version: 3.8.3
Operating System: CentOS Linux 7 (Core)
Kernel: Linux 3.10.0-327.28.3.el7.x86_64
Architecture: x86-64
Volume: Replicated 3-Node Volume, ~400GB of around a million files

Description of Problem:
One of the bricks died. The only suspicious log entries I can find are in etc-glusterfs-glusterd.vol.log (shown below). I am trying to understand why the brick died and how to prevent it in the future. At the time, I was forcing replication (find . | xargs stat on the mount). Some services that use the gluster mount were also starting up.

[2016-09-13 20:01:50.033369] W [socket.c:590:__socket_rwv] 0-management: readv on /var/run/gluster/cfc57a83cf77779864900aa08380be93.socket failed (No data available)
[2016-09-13 20:01:50.033830] I [MSGID: 106005] [glusterd-handler.c:5050:__glusterd_brick_rpc_notify] 0-management: Brick 172.17.32.28:/usr/local/volname/local-data/mirrored-data has disconnected from glusterd.
[2016-09-13 20:01:50.121316] W [rpcsvc.c:265:rpcsvc_program_actor] 0-rpc-service: RPC program not available (req 1298437 330) for 172.17.32.28:49146
[2016-09-13 20:01:50.121339] E [rpcsvc.c:560:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2016-09-13 20:01:50.121383] W [rpcsvc.c:265:rpcsvc_program_actor] 0-rpc-service: RPC program not available (req 1298437 330) for 172.17.32.28:49146
[2016-09-13 20:01:50.121392] E [rpcsvc.c:560:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
The message "I [MSGID: 106005] [glusterd-handler.c:5050:__glusterd_brick_rpc_notify] 0-management: Brick 172.17.32.28:/usr/local/volname/local-data/mirrored-data has disconnected from glusterd." repeated 34 times between [2016-09-13 20:01:50.033830] and [2016-09-13 20:03:40.010862]

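For anyone trying to reproduce the load: the crawl mentioned above (find . | xargs stat) can be written so that file names containing spaces or newlines are handled correctly. This is only a sketch. The function name crawl_mount and the default path /mnt/gluster are placeholders I made up for illustration, not the actual mount point from this setup.

```shell
#!/bin/sh
# Walk every entry under a directory and stat it, discarding the output.
# crawl_mount and the /mnt/gluster default are hypothetical names,
# not taken from the environment described above.
crawl_mount() {
    mount_point="${1:-/mnt/gluster}"
    # -print0 / -0 pass NUL-terminated names, so spaces and newlines
    # in file names do not split one path into several arguments.
    find "$mount_point" -print0 | xargs -0 stat > /dev/null
}
```

It would be invoked as crawl_mount /path/to/mount. The return status is that of xargs, so a non-zero value means at least one stat call failed.
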
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users