This particular error gives me the impression that even the debug logs in Glusterd are still incomplete, as they do not keep track of all the functions hit during a transaction. In a few functions we do log a debug message with the ret value returned from the function, but this is not done consistently across all functions. I am not sure why this log was not added at the end of each function; was it intentional? Having the ret value logged in every function would have made it easy to trace exactly where the problem happened.
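To make the idea concrete, here is a minimal sketch of what logging the ret value at every function exit could look like. It is not the GlusterFS logging API: `trace_ret`, `TRACE_RETURN`, `trace_count`, `validate_brick` and `stage_add_brick` are hypothetical names made up for illustration, and the output goes to stderr rather than through the real logging framework.

```c
#include <stdio.h>

/* Hypothetical counter of exit traces emitted so far (illustration only). */
static int trace_count = 0;

/* Hypothetical helper: logs where a function is returning from and with
 * which value, then hands the value back so it can be used in a return. */
static int trace_ret(const char *file, int line, const char *func, int ret)
{
    trace_count++;
    fprintf(stderr, "T [%s:%d:%s] returning %d\n", file, line, func, ret);
    return ret;
}

/* Use in place of a plain "return ret;" so every exit path is traced. */
#define TRACE_RETURN(ret) return trace_ret(__FILE__, __LINE__, __func__, (ret))

/* Stand-in for a validation step: accepts only absolute paths. */
static int validate_brick(const char *path)
{
    int ret = (path != NULL && path[0] == '/') ? 0 : -1;
    TRACE_RETURN(ret);
}

/* Stand-in for a staging step that propagates the callee's result. */
static int stage_add_brick(const char *path)
{
    int ret = validate_brick(path);
    TRACE_RETURN(ret);
}
```

Calling `stage_add_brick("relative/path")` in this sketch emits one trace line per function on the way out, so the first function to report -1 is the one where the failure originated.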
I am planning to add a debug log (the most suitable one would be a TRACE log) across the complete Glusterd code path. Your thoughts?

--Atin

-------- Original Message --------
Subject: [Gluster-users] Volume add-brick: failed: (with no error message)
Date: Tue, 15 Apr 2014 16:26:47 +0100
From: Iain Milne <[email protected]>
To: [email protected]

Hi folks,

We've had a 2 node gluster array working great for the last year. Each brick is a 37TB xfs mount. It's now on Centos 6.5 (x64) running gluster 3.4.3-2

Volume Name: gfs
Type: Distribute
Volume ID: ddbb46bb-821e-44db-bc7e-32f43334f62c
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: server1:/mnt/data
Brick2: server2:/mnt/data

We've just bought a new server (identical in every way to the previous two) and we're trying to get it added to the volume. The peering process goes fine:

Number of Peers: 2

Hostname: server2
Uuid: 02f1a25b-afd8-49e2-8708-95456f6b8473
State: Peer in Cluster (Connected)

Hostname: server3
Port: 24007
Uuid: 3fc9df26-bb49-4c74-8eae-4b3f37389224
State: Peer in Cluster (Connected)

The only thing of interest (?) there is the addition of the port number for the new server. Neither of the old servers show a port, even when running the peer status command on any of the boxes.

The main problem is the addition of the new server/brick:

[root@server1 glusterfs]# gluster volume add-brick gfs server3:/mnt/data
volume add-brick: failed:

There's no error there at all: just a blank after the colon.

The logs on server1 (the one trying to do the add):

W [rpc-transport.c:175:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
I [socket.c:3480:socket_init] 0-glusterfs: SSL support is NOT enabled
I [socket.c:3495:socket_init] 0-glusterfs: using system polling thread
I [cli-cmd-volume.c:1336:cli_check_gsync_present] 0-: geo-replication not installed
I [cli-rpc-ops.c:1695:gf_cli_add_brick_cbk] 0-cli: Received resp to add brick
I [input.c:36:cli_batch] 0-: Exiting with: -1

And the logs on server3 (the one being added):

E [glusterd-op-sm.c:3719:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Add brick', Status : -1

The current storage array is live and in-use by users, so it can't be taken offline at short notice.

For completeness, here's glusterd on server3 running in debug mode when the add-brick command was attempted:

[2014-04-15 15:03:33.133976] D [glusterd-handler.c:549:__glusterd_handle_cluster_lock] 0-management: Received LOCK from uuid: 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.134013] D [glusterd-utils.c:4936:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-04-15 15:03:33.134031] D [glusterd-op-sm.c:5355:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_LOCK'
[2014-04-15 15:03:33.134051] D [glusterd-handler.c:572:__glusterd_handle_cluster_lock] 0-management: Returning 0
[2014-04-15 15:03:33.134065] D [glusterd-op-sm.c:5432:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_LOCK'
[2014-04-15 15:03:33.134083] D [glusterd-utils.c:340:glusterd_lock] 0-management: Cluster lock held by 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.134096] D [glusterd-op-sm.c:2445:glusterd_op_ac_lock] 0-management: Lock Returned 0
[2014-04-15 15:03:33.134153] D [glusterd-handler.c:1776:glusterd_op_lock_send_resp] 0-management: Responded to lock, ret: 0
[2014-04-15 15:03:33.134171] D [glusterd-utils.c:5598:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Default' to 'Locked' due to event 'GD_OP_EVENT_LOCK'
[2014-04-15 15:03:33.134187] D [glusterd-utils.c:5600:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-04-15 15:03:33.135409] D [glusterd-utils.c:4936:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-04-15 15:03:33.135452] D [glusterd-handler.c:604:glusterd_req_ctx_create] 0-management: Received op from uuid 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.135481] D [glusterd-op-sm.c:5355:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_STAGE_OP'
[2014-04-15 15:03:33.135497] D [glusterd-op-sm.c:5432:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_STAGE_OP'
[2014-04-15 15:03:33.135524] D [glusterd-utils.c:1209:glusterd_volinfo_find] 0-: Volume gfs found
[2014-04-15 15:03:33.135537] D [glusterd-utils.c:1216:glusterd_volinfo_find] 0-: Returning 0
[2014-04-15 15:03:33.135554] D [glusterd-utils.c:5223:glusterd_is_rb_started] 0-: is_rb_started:status=0
[2014-04-15 15:03:33.135600] D [glusterd-utils.c:5232:glusterd_is_rb_paused] 0-: is_rb_paused:status=0
[2014-04-15 15:03:33.135643] D [glusterd-utils.c:803:glusterd_brickinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135662] D [glusterd-utils.c:865:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-04-15 15:03:33.135677] D [glusterd-utils.c:665:glusterd_volinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135698] D [glusterd-utils.c:749:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135713] D [glusterd-utils.c:777:glusterd_volinfo_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135729] D [glusterd-utils.c:803:glusterd_brickinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135742] D [glusterd-utils.c:865:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-04-15 15:03:33.135755] D [glusterd-utils.c:665:glusterd_volinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135771] D [glusterd-utils.c:749:glusterd_volume_brickinfos_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135784] D [glusterd-utils.c:777:glusterd_volinfo_delete] 0-management: Returning 0
[2014-04-15 15:03:33.135797] D [glusterd-utils.c:803:glusterd_brickinfo_new] 0-management: Returning 0
[2014-04-15 15:03:33.135810] D [glusterd-utils.c:865:glusterd_brickinfo_new_from_brick] 0-management: Returning 0
[2014-04-15 15:03:33.136093] D [glusterd-utils.c:5029:glusterd_friend_find_by_hostname] 0-management: Unable to find friend: server3
[2014-04-15 15:03:33.136194] D [glusterd-utils.c:290:glusterd_is_local_addr] 0-management: 10.0.0.244
[2014-04-15 15:03:33.136755] D [glusterd-utils.c:257:glusterd_interface_search] 0-management: 10.0.0.244 is local address at interface em1
[2014-04-15 15:03:33.136778] D [glusterd-utils.c:5064:glusterd_hostname_to_uuid] 0-management: returning 0
[2014-04-15 15:03:33.136790] D [glusterd-utils.c:819:glusterd_resolve_brick] 0-management: Returning 0
[2014-04-15 15:03:33.136818] D [glusterd-utils.c:5215:glusterd_new_brick_validate] 0-management: returning 0
[2014-04-15 15:03:33.136849] D [glusterd-brick-ops.c:1177:glusterd_op_stage_add_brick] 0-management: Returning -1
[2014-04-15 15:03:33.136866] D [glusterd-op-sm.c:3975:glusterd_op_stage_validate] 0-management: Returning -1
[2014-04-15 15:03:33.136878] E [glusterd-op-sm.c:3719:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Add brick', Status : -1
[2014-04-15 15:03:33.136940] D [glusterd-handler.c:1891:glusterd_op_stage_send_resp] 0-management: Responded to stage, ret: 0
[2014-04-15 15:03:33.136959] D [glusterd-op-sm.c:3728:glusterd_op_ac_stage_op] 0-management: Returning with 0
[2014-04-15 15:03:33.136975] D [glusterd-utils.c:5598:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Locked' to 'Staged' due to event 'GD_OP_EVENT_STAGE_OP'
[2014-04-15 15:03:33.136989] D [glusterd-utils.c:5600:glusterd_sm_tr_log_transition_add] 0-management: returning 0
[2014-04-15 15:03:33.138024] D [glusterd-handler.c:1824:__glusterd_handle_cluster_unlock] 0-management: Received UNLOCK from uuid: 881743a9-b71e-45a9-8528-cc932837ebb8
[2014-04-15 15:03:33.138063] D [glusterd-utils.c:4936:glusterd_friend_find_by_uuid] 0-management: Friend found... state: Peer in Cluster
[2014-04-15 15:03:33.138105] D [glusterd-op-sm.c:5355:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_UNLOCK'
[2014-04-15 15:03:33.138123] D [glusterd-op-sm.c:5432:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_UNLOCK'
[2014-04-15 15:03:33.138139] D [glusterd-op-sm.c:2469:glusterd_op_ac_unlock] 0-management: Unlock Returned 0
[2014-04-15 15:03:33.138192] D [glusterd-handler.c:1795:glusterd_op_unlock_send_resp] 0-management: Responded to unlock, ret: 0
[2014-04-15 15:03:33.138209] D [glusterd-utils.c:5598:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Staged' to 'Default' due to event 'GD_OP_EVENT_UNLOCK'
[2014-04-15 15:03:33.138224] D [glusterd-utils.c:5600:glusterd_sm_tr_log_transition_add] 0-management: returning 0

_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/gluster-devel
