[Gluster-users] glusterfs 4.1.6 error in starting glusterd service

2019-01-16 Thread Amudhan P
Hi,

In short: when I start the glusterd service, the following error is logged in
the glusterd.log file on one server.
What needs to be done?

Error logged in glusterd.log:

[2019-01-15 17:50:13.956053] I [MSGID: 100030] [glusterfsd.c:2741:main]
0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
[2019-01-15 17:50:13.960131] I [MSGID: 106478] [glusterd.c:1423:init]
0-management: Maximum allowed open file descriptors set to 65536
[2019-01-15 17:50:13.960193] I [MSGID: 106479] [glusterd.c:1481:init]
0-management: Using /var/lib/glusterd as working directory
[2019-01-15 17:50:13.960212] I [MSGID: 106479] [glusterd.c:1486:init]
0-management: Using /var/run/gluster as pid file working directory
[2019-01-15 17:50:13.964437] W [MSGID: 103071]
[rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
channel creation failed [No such device]
[2019-01-15 17:50:13.964474] W [MSGID: 103055] [rdma.c:4938:init]
0-rdma.management: Failed to initialize IB Device
[2019-01-15 17:50:13.964491] W [rpc-transport.c:351:rpc_transport_load]
0-rpc-transport: 'rdma' initialization failed
[2019-01-15 17:50:13.964560] W [rpcsvc.c:1781:rpcsvc_create_listener]
0-rpc-service: cannot create listener, initing the transport failed
[2019-01-15 17:50:13.964579] E [MSGID: 106244] [glusterd.c:1764:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2019-01-15 17:50:14.967681] I [MSGID: 106513]
[glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 40100
[2019-01-15 17:50:14.973931] I [MSGID: 106544]
[glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
d6bf51a7-c296-492f-8dac-e81efa9dd22d
[2019-01-15 17:50:15.046620] E [MSGID: 101032]
[store.c:441:gf_store_handle_retrieve] 0-: Path corresponding to
/var/lib/glusterd/vols/gfs-tst/bricks/IP.3:-media-disk3-brick3. [No such
file or directory]
[2019-01-15 17:50:15.046685] E [MSGID: 106201]
[glusterd-store.c:3384:glusterd_store_retrieve_volumes] 0-management:
Unable to restore volume: gfs-tst
[2019-01-15 17:50:15.046718] E [MSGID: 101019] [xlator.c:720:xlator_init]
0-management: Initialization of volume 'management' failed, review your
volfile again
[2019-01-15 17:50:15.046732] E [MSGID: 101066]
[graph.c:367:glusterfs_graph_init] 0-management: initializing translator
failed
[2019-01-15 17:50:15.046741] E [MSGID: 101176]
[graph.c:738:glusterfs_graph_activate] 0-graph: init failed
[2019-01-15 17:50:15.047171] W [glusterfsd.c:1514:cleanup_and_exit]
(-->/usr/local/sbin/glusterd(glusterfs_volumes



At greater length: I am trying to simulate a situation where a volume stopped
abnormally and the entire cluster was restarted with some missing disks.

My test cluster is set up with 3 nodes, each with four disks, and I have set
up a volume with disperse 4+2.
On Node-3, two disks failed; to replace them I shut down all systems.
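
(For reference, a 4+2 disperse layout over these 12 bricks would have been
created with something along the lines of the sketch below. The brick paths
are assumptions based on the IP.x:/media/diskN/brickN naming visible in the
status output further down, not the exact command that was used.)

    # sketch only -- brick paths assumed from the naming pattern in `gluster v status`
    gluster volume create gfs-tst disperse-data 4 redundancy 2 \
        IP.2:/media/disk1/brick1 IP.3:/media/disk1/brick1 IP.4:/media/disk1/brick1 \
        IP.2:/media/disk2/brick2 IP.3:/media/disk2/brick2 IP.4:/media/disk2/brick2 \
        IP.2:/media/disk3/brick3 IP.3:/media/disk3/brick3 IP.4:/media/disk3/brick3 \
        IP.2:/media/disk4/brick4 IP.3:/media/disk4/brick4 IP.4:/media/disk4/brick4 \
        force
    gluster volume start gfs-tst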

Below are the steps performed:

1. Unmount the volume from the client machine.
2. Shut down all systems by running `shutdown -h now` (without stopping the
volume or the glusterd service).
3. Replace the faulty disks in Node-3.
4. Power on all systems.
5. Format the replaced drives and mount all drives.
6. Start the glusterd service on all nodes (success).
7. Run `volume status` from node-3.
output : [2019-01-15 16:52:17.718422]  : v status : FAILED : Staging failed
on 0083ec0c-40bf-472a-a128-458924e56c96. Please check log file for details.
8. Run `volume start gfs-tst` from node-3.
output : [2019-01-15 16:53:19.410252]  : v start gfs-tst : FAILED : Volume
gfs-tst already started

9. Run `gluster v status` on another node; it shows all bricks available, but
the 'self-heal daemon' is not running:
@gfstst-node2:~$ sudo gluster v status
Status of volume: gfs-tst
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick IP.2:/media/disk1/brick1              49152     0          Y       1517
Brick IP.4:/media/disk1/brick1              49152     0          Y       1668
Brick IP.2:/media/disk2/brick2              49153     0          Y       1522
Brick IP.4:/media/disk2/brick2              49153     0          Y       1678
Brick IP.2:/media/disk3/brick3              49154     0          Y       1527
Brick IP.4:/media/disk3/brick3              49154     0          Y       1677
Brick IP.2:/media/disk4/brick4              49155     0          Y       1541
Brick IP.4:/media/disk4/brick4              49155     0          Y       1683
Self-heal Daemon on localhost               N/A       N/A        Y       2662
Self-heal Daemon on IP.4                    N/A       N/A        Y       2786

10. Since the above output says the volume is already started, I ran the
`reset-brick` command:
   v reset-brick gfs-tst IP.3:/media/disk3/brick3 IP.3:/media/disk3/brick3
commit force

output : [2019-01-15 16:57:37.916942]  : v reset-brick gfs-tst
IP.3:/media/disk3/brick3 IP.3:/media/disk3/brick3 commit force : FAILED :
/media/disk3

Re: [Gluster-users] glusterfs 4.1.6 error in starting glusterd service

2019-01-16 Thread Atin Mukherjee
This is a case of a partial write of a transaction: the host ran out of space
on the root partition, where all the glusterd-related configuration is
persisted, so the transaction couldn't be written and the new (replaced)
brick's information wasn't persisted in the configuration. The workaround is
to copy the contents of /var/lib/glusterd/vols/gfs-tst/ from one of the other
nodes in the trusted storage pool to the node where the glusterd service fails
to come up; after that, restarting the glusterd service should bring peer
status back to reporting all nodes as healthy and connected.
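
A minimal sketch of that workaround, assuming the failing node is node-3 and a
healthy peer is reachable as gfstst-node2 (hostnames and the service-manager
invocation are placeholders; adjust to your environment):

    # on the node where glusterd fails to come up
    service glusterd stop                 # or: systemctl stop glusterd
    # pull the volume configuration from a healthy peer in the trusted storage pool
    rsync -av gfstst-node2:/var/lib/glusterd/vols/gfs-tst/ /var/lib/glusterd/vols/gfs-tst/
    service glusterd start                # or: systemctl start glusterd
    gluster peer status                   # all peers should report Connected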

On Wed, Jan 16, 2019 at 3:49 PM Amudhan P  wrote:

> Hi,
>
> In short, when I started glusterd service I am getting following error msg
> in the glusterd.log file in one server.
> what needs to be done?
>
> error logged in glusterd.log
>
> [2019-01-15 17:50:13.956053] I [MSGID: 100030] [glusterfsd.c:2741:main]
> 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
> version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
> [2019-01-15 17:50:13.960131] I [MSGID: 106478] [glusterd.c:1423:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2019-01-15 17:50:13.960193] I [MSGID: 106479] [glusterd.c:1481:init]
> 0-management: Using /var/lib/glusterd as working directory
> [2019-01-15 17:50:13.960212] I [MSGID: 106479] [glusterd.c:1486:init]
> 0-management: Using /var/run/gluster as pid file working directory
> [2019-01-15 17:50:13.964437] W [MSGID: 103071]
> [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
> channel creation failed [No such device]
> [2019-01-15 17:50:13.964474] W [MSGID: 103055] [rdma.c:4938:init]
> 0-rdma.management: Failed to initialize IB Device
> [2019-01-15 17:50:13.964491] W [rpc-transport.c:351:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2019-01-15 17:50:13.964560] W [rpcsvc.c:1781:rpcsvc_create_listener]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2019-01-15 17:50:13.964579] E [MSGID: 106244] [glusterd.c:1764:init]
> 0-management: creation of 1 listeners failed, continuing with succeeded
> transport
> [2019-01-15 17:50:14.967681] I [MSGID: 106513]
> [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved
> op-version: 40100
> [2019-01-15 17:50:14.973931] I [MSGID: 106544]
> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
> d6bf51a7-c296-492f-8dac-e81efa9dd22d
> [2019-01-15 17:50:15.046620] E [MSGID: 101032]
> [store.c:441:gf_store_handle_retrieve] 0-: Path corresponding to
> /var/lib/glusterd/vols/gfs-tst/bricks/IP.3:-media-disk3-brick3. [No such
> file or directory]
> [2019-01-15 17:50:15.046685] E [MSGID: 106201]
> [glusterd-store.c:3384:glusterd_store_retrieve_volumes] 0-management:
> Unable to restore volume: gfs-tst
> [2019-01-15 17:50:15.046718] E [MSGID: 101019] [xlator.c:720:xlator_init]
> 0-management: Initialization of volume 'management' failed, review your
> volfile again
> [2019-01-15 17:50:15.046732] E [MSGID: 101066]
> [graph.c:367:glusterfs_graph_init] 0-management: initializing translator
> failed
> [2019-01-15 17:50:15.046741] E [MSGID: 101176]
> [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
> [2019-01-15 17:50:15.047171] W [glusterfsd.c:1514:cleanup_and_exit]
> (-->/usr/local/sbin/glusterd(glusterfs_volumes
>
>
>
> In long, I am trying to simulate a situation. where volume stoped
> abnormally and
> entire cluster restarted with some missing disks.
>
> My test cluster is set up with 3 nodes and each has four disks, I have
> setup a volume with disperse 4+2.
> In Node-3 2 disks have failed, to replace I have shutdown all system
>
> below are the steps done.
>
> 1. umount from client machine
> 2. shutdown all system by running `shutdown -h now` command ( without
> stopping volume and stop service)
> 3. replace faulty disk in Node-3
> 4. powered ON all system
> 5. format replaced drives, and mount all drives
> 6. start glusterd service in all node (success)
> 7. Now running `voulume status` command from node-3
> output : [2019-01-15 16:52:17.718422]  : v status : FAILED : Staging
> failed on 0083ec0c-40bf-472a-a128-458924e56c96. Please check log file for
> details.
> 8. running `voulume start gfs-tst` command from node-3
> output : [2019-01-15 16:53:19.410252]  : v start gfs-tst : FAILED : Volume
> gfs-tst already started
>
> 9. running `gluster v status` in other node. showing all brick available
> but 'self-heal daemon' not running
> @gfstst-node2:~$ sudo gluster v status
> Status of volume: gfs-tst
> Gluster process TCP Port  RDMA Port  Online
> Pid
>
> --
> Brick IP.2:/media/disk1/brick1  49152 0  Y   1517
> Brick IP.4:/media/disk1/brick1  49152 0  Y   1668
> Brick IP.2:/media/disk2/brick2  49153 0  Y   1522
> Brick IP.4:/media

Re: [Gluster-users] glusterfs 4.1.6 error in starting glusterd service

2019-01-16 Thread Amudhan P
Atin,
I have copied the contents of 'gfs-tst' from the vols folder on another node.
When starting the service again, it fails with the following error in the
glusterd.log file:

[2019-01-15 20:16:59.513023] I [MSGID: 100030] [glusterfsd.c:2741:main]
0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
[2019-01-15 20:16:59.517164] I [MSGID: 106478] [glusterd.c:1423:init]
0-management: Maximum allowed open file descriptors set to 65536
[2019-01-15 20:16:59.517264] I [MSGID: 106479] [glusterd.c:1481:init]
0-management: Using /var/lib/glusterd as working directory
[2019-01-15 20:16:59.517283] I [MSGID: 106479] [glusterd.c:1486:init]
0-management: Using /var/run/gluster as pid file working directory
[2019-01-15 20:16:59.521508] W [MSGID: 103071]
[rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
channel creation failed [No such device]
[2019-01-15 20:16:59.521544] W [MSGID: 103055] [rdma.c:4938:init]
0-rdma.management: Failed to initialize IB Device
[2019-01-15 20:16:59.521562] W [rpc-transport.c:351:rpc_transport_load]
0-rpc-transport: 'rdma' initialization failed
[2019-01-15 20:16:59.521629] W [rpcsvc.c:1781:rpcsvc_create_listener]
0-rpc-service: cannot create listener, initing the transport failed
[2019-01-15 20:16:59.521648] E [MSGID: 106244] [glusterd.c:1764:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2019-01-15 20:17:00.529390] I [MSGID: 106513]
[glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 40100
[2019-01-15 20:17:00.608354] I [MSGID: 106544]
[glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
d6bf51a7-c296-492f-8dac-e81efa9dd22d
[2019-01-15 20:17:00.650911] W [MSGID: 106425]
[glusterd-store.c:2643:glusterd_store_retrieve_bricks] 0-management: failed
to get statfs() call on brick /media/disk4/brick4 [No such file or
directory]
[2019-01-15 20:17:00.691240] I [MSGID: 106498]
[glusterd-handler.c:3614:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2019-01-15 20:17:00.691307] W [MSGID: 106061]
[glusterd-handler.c:3408:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2019-01-15 20:17:00.691331] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2019-01-15 20:17:00.692547] E [MSGID: 106187]
[glusterd-store.c:4662:glusterd_resolve_all_bricks] 0-glusterd: resolve
brick failed in restore
[2019-01-15 20:17:00.692582] E [MSGID: 101019] [xlator.c:720:xlator_init]
0-management: Initialization of volume 'management' failed, review your
volfile again
[2019-01-15 20:17:00.692597] E [MSGID: 101066]
[graph.c:367:glusterfs_graph_init] 0-management: initializing translator
failed
[2019-01-15 20:17:00.692607] E [MSGID: 101176]
[graph.c:738:glusterfs_graph_activate] 0-graph: init failed
[2019-01-15 20:17:00.693004] W [glusterfsd.c:1514:cleanup_and_exit]
(-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52]
-->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41]
-->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-:
received signum (-1), shutting down


On Wed, Jan 16, 2019 at 4:34 PM Atin Mukherjee  wrote:

> This is a case of partial write of a transaction and as the host ran out
> of space for the root partition where all the glusterd related
> configurations are persisted, the transaction couldn't be written and hence
> the new (replaced) brick's information wasn't persisted in the
> configuration. The workaround for this is to copy the content of
> /var/lib/glusterd/vols/gfs-tst/ from one of the nodes in the trusted
> storage pool to the node where glusterd service fails to come up and post
> that restarting the glusterd service should be able to make peer status
> reporting all nodes healthy and connected.
>
> On Wed, Jan 16, 2019 at 3:49 PM Amudhan P  wrote:
>
>> Hi,
>>
>> In short, when I started glusterd service I am getting following error
>> msg in the glusterd.log file in one server.
>> what needs to be done?
>>
>> error logged in glusterd.log
>>
>> [2019-01-15 17:50:13.956053] I [MSGID: 100030] [glusterfsd.c:2741:main]
>> 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
>> version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
>> [2019-01-15 17:50:13.960131] I [MSGID: 106478] [glusterd.c:1423:init]
>> 0-management: Maximum allowed open file descriptors set to 65536
>> [2019-01-15 17:50:13.960193] I [MSGID: 106479] [glusterd.c:1481:init]
>> 0-management: Using /var/lib/glusterd as working directory
>> [2019-01-15 17:50:13.960212] I [MSGID: 106479] [glusterd.c:1486:init]
>> 0-management: Using /var/run/gluster as pid file working directory
>> [2019-01-15 17:50:13.964437] W [MSGID: 103071]
>> [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>> channel creation failed [No such device]
>> [2019-01-15 17:50:13.964474] W [MSGID: 103055] [rdma.c:4938:init]

Re: [Gluster-users] glusterfs 4.1.6 error in starting glusterd service

2019-01-16 Thread Atin Mukherjee
On Wed, Jan 16, 2019 at 5:02 PM Amudhan P  wrote:

> Atin,
> I have copied the content of 'gfs-tst' from vol folder in another node.
> when starting service again fails with error msg in glusterd.log file.
>
> [2019-01-15 20:16:59.513023] I [MSGID: 100030] [glusterfsd.c:2741:main]
> 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
> version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
> [2019-01-15 20:16:59.517164] I [MSGID: 106478] [glusterd.c:1423:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2019-01-15 20:16:59.517264] I [MSGID: 106479] [glusterd.c:1481:init]
> 0-management: Using /var/lib/glusterd as working directory
> [2019-01-15 20:16:59.517283] I [MSGID: 106479] [glusterd.c:1486:init]
> 0-management: Using /var/run/gluster as pid file working directory
> [2019-01-15 20:16:59.521508] W [MSGID: 103071]
> [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
> channel creation failed [No such device]
> [2019-01-15 20:16:59.521544] W [MSGID: 103055] [rdma.c:4938:init]
> 0-rdma.management: Failed to initialize IB Device
> [2019-01-15 20:16:59.521562] W [rpc-transport.c:351:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2019-01-15 20:16:59.521629] W [rpcsvc.c:1781:rpcsvc_create_listener]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2019-01-15 20:16:59.521648] E [MSGID: 106244] [glusterd.c:1764:init]
> 0-management: creation of 1 listeners failed, continuing with succeeded
> transport
> [2019-01-15 20:17:00.529390] I [MSGID: 106513]
> [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved
> op-version: 40100
> [2019-01-15 20:17:00.608354] I [MSGID: 106544]
> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
> d6bf51a7-c296-492f-8dac-e81efa9dd22d
> [2019-01-15 20:17:00.650911] W [MSGID: 106425]
> [glusterd-store.c:2643:glusterd_store_retrieve_bricks] 0-management: failed
> to get statfs() call on brick /media/disk4/brick4 [No such file or
> directory]
>

This means that the underlying brick /media/disk4/brick4 doesn't exist. You
already mentioned that you had replaced the faulty disk, but have you not
mounted it yet?
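
A quick way to confirm that on the affected node, for example:

    df -h /media/disk4            # is the replaced disk actually mounted here?
    ls -ld /media/disk4/brick4    # does the brick directory exist on it?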


> [2019-01-15 20:17:00.691240] I [MSGID: 106498]
> [glusterd-handler.c:3614:glusterd_friend_add_from_peerinfo] 0-management:
> connect returned 0
> [2019-01-15 20:17:00.691307] W [MSGID: 106061]
> [glusterd-handler.c:3408:glusterd_transport_inet_options_build] 0-glusterd:
> Failed to get tcp-user-timeout
> [2019-01-15 20:17:00.691331] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2019-01-15 20:17:00.692547] E [MSGID: 106187]
> [glusterd-store.c:4662:glusterd_resolve_all_bricks] 0-glusterd: resolve
> brick failed in restore
> [2019-01-15 20:17:00.692582] E [MSGID: 101019] [xlator.c:720:xlator_init]
> 0-management: Initialization of volume 'management' failed, review your
> volfile again
> [2019-01-15 20:17:00.692597] E [MSGID: 101066]
> [graph.c:367:glusterfs_graph_init] 0-management: initializing translator
> failed
> [2019-01-15 20:17:00.692607] E [MSGID: 101176]
> [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
> [2019-01-15 20:17:00.693004] W [glusterfsd.c:1514:cleanup_and_exit]
> (-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52]
> -->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41]
> -->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-:
> received signum (-1), shutting down
>
>
> On Wed, Jan 16, 2019 at 4:34 PM Atin Mukherjee 
> wrote:
>
>> This is a case of partial write of a transaction and as the host ran out
>> of space for the root partition where all the glusterd related
>> configurations are persisted, the transaction couldn't be written and hence
>> the new (replaced) brick's information wasn't persisted in the
>> configuration. The workaround for this is to copy the content of
>> /var/lib/glusterd/vols/gfs-tst/ from one of the nodes in the trusted
>> storage pool to the node where glusterd service fails to come up and post
>> that restarting the glusterd service should be able to make peer status
>> reporting all nodes healthy and connected.
>>
>> On Wed, Jan 16, 2019 at 3:49 PM Amudhan P  wrote:
>>
>>> Hi,
>>>
>>> In short, when I started glusterd service I am getting following error
>>> msg in the glusterd.log file in one server.
>>> what needs to be done?
>>>
>>> error logged in glusterd.log
>>>
>>> [2019-01-15 17:50:13.956053] I [MSGID: 100030] [glusterfsd.c:2741:main]
>>> 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
>>> version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
>>> [2019-01-15 17:50:13.960131] I [MSGID: 106478] [glusterd.c:1423:init]
>>> 0-management: Maximum allowed open file descriptors set to 65536
>>> [2019-01-15 17:50:13.960193] I [MSGID: 106479] [glusterd.c:1481:init]
>>> 0-management: Using /var/lib/glusterd as working directory
>>> [2019-0

Re: [Gluster-users] glusterfs 4.1.6 error in starting glusterd service

2019-01-16 Thread Amudhan P
Yes, I did mount the bricks, but the folder 'brick4' was still not created
inside the mount. Do I need to create this folder myself? When I run
replace-brick it creates the folder inside the brick; I have seen this
behaviour before when running replace-brick or when heal begins.

On Wed, Jan 16, 2019 at 5:05 PM Atin Mukherjee  wrote:

>
>
> On Wed, Jan 16, 2019 at 5:02 PM Amudhan P  wrote:
>
>> Atin,
>> I have copied the content of 'gfs-tst' from vol folder in another node.
>> when starting service again fails with error msg in glusterd.log file.
>>
>> [2019-01-15 20:16:59.513023] I [MSGID: 100030] [glusterfsd.c:2741:main]
>> 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
>> version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
>> [2019-01-15 20:16:59.517164] I [MSGID: 106478] [glusterd.c:1423:init]
>> 0-management: Maximum allowed open file descriptors set to 65536
>> [2019-01-15 20:16:59.517264] I [MSGID: 106479] [glusterd.c:1481:init]
>> 0-management: Using /var/lib/glusterd as working directory
>> [2019-01-15 20:16:59.517283] I [MSGID: 106479] [glusterd.c:1486:init]
>> 0-management: Using /var/run/gluster as pid file working directory
>> [2019-01-15 20:16:59.521508] W [MSGID: 103071]
>> [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>> channel creation failed [No such device]
>> [2019-01-15 20:16:59.521544] W [MSGID: 103055] [rdma.c:4938:init]
>> 0-rdma.management: Failed to initialize IB Device
>> [2019-01-15 20:16:59.521562] W [rpc-transport.c:351:rpc_transport_load]
>> 0-rpc-transport: 'rdma' initialization failed
>> [2019-01-15 20:16:59.521629] W [rpcsvc.c:1781:rpcsvc_create_listener]
>> 0-rpc-service: cannot create listener, initing the transport failed
>> [2019-01-15 20:16:59.521648] E [MSGID: 106244] [glusterd.c:1764:init]
>> 0-management: creation of 1 listeners failed, continuing with succeeded
>> transport
>> [2019-01-15 20:17:00.529390] I [MSGID: 106513]
>> [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved
>> op-version: 40100
>> [2019-01-15 20:17:00.608354] I [MSGID: 106544]
>> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
>> d6bf51a7-c296-492f-8dac-e81efa9dd22d
>> [2019-01-15 20:17:00.650911] W [MSGID: 106425]
>> [glusterd-store.c:2643:glusterd_store_retrieve_bricks] 0-management: failed
>> to get statfs() call on brick /media/disk4/brick4 [No such file or
>> directory]
>>
>
> This means that underlying brick /media/disk4/brick4 doesn't exist. You
> already mentioned that you had replaced the faulty disk, but have you not
> mounted it yet?
>
>
>> [2019-01-15 20:17:00.691240] I [MSGID: 106498]
>> [glusterd-handler.c:3614:glusterd_friend_add_from_peerinfo] 0-management:
>> connect returned 0
>> [2019-01-15 20:17:00.691307] W [MSGID: 106061]
>> [glusterd-handler.c:3408:glusterd_transport_inet_options_build] 0-glusterd:
>> Failed to get tcp-user-timeout
>> [2019-01-15 20:17:00.691331] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
>> 0-management: setting frame-timeout to 600
>> [2019-01-15 20:17:00.692547] E [MSGID: 106187]
>> [glusterd-store.c:4662:glusterd_resolve_all_bricks] 0-glusterd: resolve
>> brick failed in restore
>> [2019-01-15 20:17:00.692582] E [MSGID: 101019] [xlator.c:720:xlator_init]
>> 0-management: Initialization of volume 'management' failed, review your
>> volfile again
>> [2019-01-15 20:17:00.692597] E [MSGID: 101066]
>> [graph.c:367:glusterfs_graph_init] 0-management: initializing translator
>> failed
>> [2019-01-15 20:17:00.692607] E [MSGID: 101176]
>> [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
>> [2019-01-15 20:17:00.693004] W [glusterfsd.c:1514:cleanup_and_exit]
>> (-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52]
>> -->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41]
>> -->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-:
>> received signum (-1), shutting down
>>
>>
>> On Wed, Jan 16, 2019 at 4:34 PM Atin Mukherjee 
>> wrote:
>>
>>> This is a case of partial write of a transaction and as the host ran out
>>> of space for the root partition where all the glusterd related
>>> configurations are persisted, the transaction couldn't be written and hence
>>> the new (replaced) brick's information wasn't persisted in the
>>> configuration. The workaround for this is to copy the content of
>>> /var/lib/glusterd/vols/gfs-tst/ from one of the nodes in the trusted
>>> storage pool to the node where glusterd service fails to come up and post
>>> that restarting the glusterd service should be able to make peer status
>>> reporting all nodes healthy and connected.
>>>
>>> On Wed, Jan 16, 2019 at 3:49 PM Amudhan P  wrote:
>>>
 Hi,

 In short, when I started glusterd service I am getting following error
 msg in the glusterd.log file in one server.
 what needs to be done?

 error logged in glusterd.log

 [2019-01-15 17:50:13.956053] I [MSGID: 100030] [glusterfsd.c:2741:main]
 0-/usr/

[Gluster-users] VolumeOpt Set fails of a freshly created volume

2019-01-16 Thread David Spisla
Dear Gluster Community,

I created a replica 4 volume from gluster-node1 on a 4-node cluster with
SSL/TLS network encryption. While setting the 'cluster.use-compound-fops'
option, I got the following error:

$  volume set: failed: Commit failed on gluster-node2. Please check log
file for details.

Here is the glusterd.log from gluster-node1:

*[2019-01-15 15:18:36.813034] I [run.c:242:runner_log]
(-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xdad2a)
[0x7fc24d91cd2a]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xda81c)
[0x7fc24d91c81c] -->/usr/lib64/libglusterfs.so.0(runner_log+0x105)
[0x7fc253dce0b5] ) 0-management: Ran script:
/var/lib/glusterd/hooks/1/set/post/S30samba-set.sh
--volname=integration-archive1 -o cluster.use-compound-fops=on
--gd-workdir=/var/lib/glusterd*
[2019-01-15 15:18:36.821193] I [run.c:242:runner_log]
(-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xdad2a)
[0x7fc24d91cd2a]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xda81c)
[0x7fc24d91c81c] -->/usr/lib64/libglusterfs.so.0(runner_log+0x105)
[0x7fc253dce0b5] ) 0-management: Ran script:
/var/lib/glusterd/hooks/1/set/post/S32gluster_enable_shared_storage.sh
--volname=integration-archive1 -o cluster.use-compound-fops=on
--gd-workdir=/var/lib/glusterd
[2019-01-15 15:18:36.842383] W [socket.c:719:__socket_rwv] 0-management:
readv on 10.10.12.42:24007 failed (Input/output error)
*[2019-01-15 15:18:36.842415] E [socket.c:246:ssl_dump_error_stack]
0-management:   error:140943F2:SSL routines:ssl3_read_bytes:sslv3 alert
unexpected message*
The message "E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler" repeated 81 times between [2019-01-15 15:18:30.735508] and
[2019-01-15 15:18:36.808994]
[2019-01-15 15:18:36.842439] I [MSGID: 106004]
[glusterd-handler.c:6430:__glusterd_peer_rpc_notify] 0-management: Peer
<gluster-node2> (<02724bb6-cb34-4ec3-8306-c2950e0acf9b>), in state <Peer in
Cluster>, has disconnected from glusterd.
[2019-01-15 15:18:36.842638] W
[glusterd-locks.c:795:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
[0x7fc24d866349]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
[0x7fc24d86f950]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0239)
[0x7fc24d922239] ) 0-management: Lock for vol archive1 not held
[2019-01-15 15:18:36.842656] W [MSGID: 106117]
[glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
released for archive1
[2019-01-15 15:18:36.842674] W
[glusterd-locks.c:795:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
[0x7fc24d866349]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
[0x7fc24d86f950]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0239)
[0x7fc24d922239] ) 0-management: Lock for vol archive2 not held
[2019-01-15 15:18:36.842680] W [MSGID: 106117]
[glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
released for archive2
[2019-01-15 15:18:36.842694] W
[glusterd-locks.c:795:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
[0x7fc24d866349]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
[0x7fc24d86f950]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0239)
[0x7fc24d922239] ) 0-management: Lock for vol gluster_shared_storage not
held
[2019-01-15 15:18:36.842702] W [MSGID: 106117]
[glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
released for gluster_shared_storage
[2019-01-15 15:18:36.842719] W
[glusterd-locks.c:806:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
[0x7fc24d866349]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
[0x7fc24d86f950]
-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0074)
[0x7fc24d922074] ) 0-management: Lock owner mismatch. Lock for vol
integration-archive1 held by ffdaa400-82cc-4ada-8ea7-144bf3714269
[2019-01-15 15:18:36.842727] W [MSGID: 106117]
[glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
released for integration-archive1
[2019-01-15 15:18:36.842970] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7fc253d7f18d] (-->
/usr/lib64/libgfrpc.so.0(+0xca3d)[0x7fc253b46a3d] (-->
/usr/lib64/libgfrpc.so.0(+0xcb5e)[0x7fc253b46b5e] (-->
/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x8b)[0x7fc253b480bb]
(--> /usr/lib64/libgfrpc.so.0(+0xec68)[0x7fc253b48c68] ) 0-management:
forced unwinding frame type(glusterd mgmt) op(--(4)) called at 2019-01-15
15:18:36.802613 (xid=0x6da)
[2019-01-15 15:18:36.842994] E [MSGID: 106152]
[glusterd-syncop.c:104:gd_collate_errors] 0-glusterd: Commit failed on
gluster-node2. Please check log file for details.

And here is the glusterd.log from gluster-node2:

*[2019-01-15 15:18:36.901788] I [run.c:242:runner_log]
(-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xdad2a)
[0x7f9fba02cd2a]
-->/usr/lib64/glusterfs

Re: [Gluster-users] glusterfs 4.1.6 error in starting glusterd service

2019-01-16 Thread Atin Mukherjee
If gluster volume info/status shows the brick as /media/disk4/brick4, then
you need to mount the disk at the same path and hence create the brick4
directory explicitly. I fail to understand the rationale by which only
/media/disk4 could be used as the mount path for the brick.
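
Concretely, something along these lines on node-3 should line the mount up
with what the volume expects (the device name is a placeholder for the
replaced disk):

    mount /dev/sdX1 /media/disk4      # placeholder device; use the new disk's partition
    mkdir -p /media/disk4/brick4      # recreate the brick directory glusterd expects
    service glusterd restart          # or: systemctl restart glusterd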

On Wed, Jan 16, 2019 at 5:24 PM Amudhan P  wrote:

> Yes, I did mount bricks but the folder 'brick4' was still not created
> inside the brick.
> Do I need to create this folder because when I run replace-brick it will
> create folder inside the brick. I have seen this behavior before when
> running replace-brick or heal begins.
>
> On Wed, Jan 16, 2019 at 5:05 PM Atin Mukherjee 
> wrote:
>
>>
>>
>> On Wed, Jan 16, 2019 at 5:02 PM Amudhan P  wrote:
>>
>>> Atin,
>>> I have copied the content of 'gfs-tst' from vol folder in another node.
>>> when starting service again fails with error msg in glusterd.log file.
>>>
>>> [2019-01-15 20:16:59.513023] I [MSGID: 100030] [glusterfsd.c:2741:main]
>>> 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
>>> version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
>>> [2019-01-15 20:16:59.517164] I [MSGID: 106478] [glusterd.c:1423:init]
>>> 0-management: Maximum allowed open file descriptors set to 65536
>>> [2019-01-15 20:16:59.517264] I [MSGID: 106479] [glusterd.c:1481:init]
>>> 0-management: Using /var/lib/glusterd as working directory
>>> [2019-01-15 20:16:59.517283] I [MSGID: 106479] [glusterd.c:1486:init]
>>> 0-management: Using /var/run/gluster as pid file working directory
>>> [2019-01-15 20:16:59.521508] W [MSGID: 103071]
>>> [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>>> channel creation failed [No such device]
>>> [2019-01-15 20:16:59.521544] W [MSGID: 103055] [rdma.c:4938:init]
>>> 0-rdma.management: Failed to initialize IB Device
>>> [2019-01-15 20:16:59.521562] W [rpc-transport.c:351:rpc_transport_load]
>>> 0-rpc-transport: 'rdma' initialization failed
>>> [2019-01-15 20:16:59.521629] W [rpcsvc.c:1781:rpcsvc_create_listener]
>>> 0-rpc-service: cannot create listener, initing the transport failed
>>> [2019-01-15 20:16:59.521648] E [MSGID: 106244] [glusterd.c:1764:init]
>>> 0-management: creation of 1 listeners failed, continuing with succeeded
>>> transport
>>> [2019-01-15 20:17:00.529390] I [MSGID: 106513]
>>> [glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved
>>> op-version: 40100
>>> [2019-01-15 20:17:00.608354] I [MSGID: 106544]
>>> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
>>> d6bf51a7-c296-492f-8dac-e81efa9dd22d
>>> [2019-01-15 20:17:00.650911] W [MSGID: 106425]
>>> [glusterd-store.c:2643:glusterd_store_retrieve_bricks] 0-management: failed
>>> to get statfs() call on brick /media/disk4/brick4 [No such file or
>>> directory]
>>>
>>
>> This means that underlying brick /media/disk4/brick4 doesn't exist. You
>> already mentioned that you had replaced the faulty disk, but have you not
>> mounted it yet?
>>
>>
>>> [2019-01-15 20:17:00.691240] I [MSGID: 106498]
>>> [glusterd-handler.c:3614:glusterd_friend_add_from_peerinfo] 0-management:
>>> connect returned 0
>>> [2019-01-15 20:17:00.691307] W [MSGID: 106061]
>>> [glusterd-handler.c:3408:glusterd_transport_inet_options_build] 0-glusterd:
>>> Failed to get tcp-user-timeout
>>> [2019-01-15 20:17:00.691331] I
>>> [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting
>>> frame-timeout to 600
>>> [2019-01-15 20:17:00.692547] E [MSGID: 106187]
>>> [glusterd-store.c:4662:glusterd_resolve_all_bricks] 0-glusterd: resolve
>>> brick failed in restore
>>> [2019-01-15 20:17:00.692582] E [MSGID: 101019]
>>> [xlator.c:720:xlator_init] 0-management: Initialization of volume
>>> 'management' failed, review your volfile again
>>> [2019-01-15 20:17:00.692597] E [MSGID: 101066]
>>> [graph.c:367:glusterfs_graph_init] 0-management: initializing translator
>>> failed
>>> [2019-01-15 20:17:00.692607] E [MSGID: 101176]
>>> [graph.c:738:glusterfs_graph_activate] 0-graph: init failed
>>> [2019-01-15 20:17:00.693004] W [glusterfsd.c:1514:cleanup_and_exit]
>>> (-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52]
>>> -->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41]
>>> -->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-:
>>> received signum (-1), shutting down
>>>
>>>
>>> On Wed, Jan 16, 2019 at 4:34 PM Atin Mukherjee 
>>> wrote:
>>>
 This is a case of partial write of a transaction and as the host ran
 out of space for the root partition where all the glusterd related
 configurations are persisted, the transaction couldn't be written and hence
 the new (replaced) brick's information wasn't persisted in the
 configuration. The workaround for this is to copy the content of
 /var/lib/glusterd/vols/gfs-tst/ from one of the nodes in the trusted
 storage pool to the node where glusterd service fails to come up and post
 that restarting the glusterd service should be ab

Re: [Gluster-users] VolumeOpt Set fails of a freshly created volume

2019-01-16 Thread Atin Mukherjee
On Wed, Jan 16, 2019 at 9:48 PM David Spisla  wrote:

> Dear Gluster Community,
>
> i created a replica 4 volume from gluster-node1 on a 4-Node Cluster with
> SSL/TLS network encryption . During setting the 'cluster.use-compound-fops'
> option, i got the error:
>
> $  volume set: failed: Commit failed on gluster-node2. Please check log
> file for details.
>
> Here is the glusterd.log from gluster-node1:
>
> *[2019-01-15 15:18:36.813034] I [run.c:242:runner_log]
> (-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xdad2a)
> [0x7fc24d91cd2a]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xda81c)
> [0x7fc24d91c81c] -->/usr/lib64/libglusterfs.so.0(runner_log+0x105)
> [0x7fc253dce0b5] ) 0-management: Ran script:
> /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh
> --volname=integration-archive1 -o cluster.use-compound-fops=on
> --gd-workdir=/var/lib/glusterd*
> [2019-01-15 15:18:36.821193] I [run.c:242:runner_log]
> (-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xdad2a)
> [0x7fc24d91cd2a]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xda81c)
> [0x7fc24d91c81c] -->/usr/lib64/libglusterfs.so.0(runner_log+0x105)
> [0x7fc253dce0b5] ) 0-management: Ran script:
> /var/lib/glusterd/hooks/1/set/post/S32gluster_enable_shared_storage.sh
> --volname=integration-archive1 -o cluster.use-compound-fops=on
> --gd-workdir=/var/lib/glusterd
> [2019-01-15 15:18:36.842383] W [socket.c:719:__socket_rwv] 0-management:
> readv on 10.10.12.42:24007 failed (Input/output error)
> *[2019-01-15 15:18:36.842415] E [socket.c:246:ssl_dump_error_stack]
> 0-management:   error:140943F2:SSL routines:ssl3_read_bytes:sslv3 alert
> unexpected message*
> The message "E [MSGID: 101191]
> [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
> handler" repeated 81 times between [2019-01-15 15:18:30.735508] and
> [2019-01-15 15:18:36.808994]
> [2019-01-15 15:18:36.842439] I [MSGID: 106004]
> [glusterd-handler.c:6430:__glusterd_peer_rpc_notify] 0-management: Peer
> <gluster-node2> (<02724bb6-cb34-4ec3-8306-c2950e0acf9b>), in state <Peer
> in Cluster>, has disconnected from glusterd.
>

The above shows that a peer disconnect event was received from gluster-node2;
this likely happened while the commit operation was in flight, and hence the
volume set failed on gluster-node2. Regarding the SSL error, I'd request
Milind to comment.
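
If that is what happened, one reasonable follow-up (a sketch, not a guaranteed
fix) is to confirm all peers are connected again and then simply retry the set:

    gluster peer status      # every peer should show State: Peer in Cluster (Connected)
    gluster volume set integration-archive1 cluster.use-compound-fops on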

[2019-01-15 15:18:36.842638] W
> [glusterd-locks.c:795:glusterd_mgmt_v3_unlock]
> (-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
> [0x7fc24d866349]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
> [0x7fc24d86f950]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0239)
> [0x7fc24d922239] ) 0-management: Lock for vol archive1 not held
> [2019-01-15 15:18:36.842656] W [MSGID: 106117]
> [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
> released for archive1
> [2019-01-15 15:18:36.842674] W
> [glusterd-locks.c:795:glusterd_mgmt_v3_unlock]
> (-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
> [0x7fc24d866349]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
> [0x7fc24d86f950]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0239)
> [0x7fc24d922239] ) 0-management: Lock for vol archive2 not held
> [2019-01-15 15:18:36.842680] W [MSGID: 106117]
> [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
> released for archive2
> [2019-01-15 15:18:36.842694] W
> [glusterd-locks.c:795:glusterd_mgmt_v3_unlock]
> (-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
> [0x7fc24d866349]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
> [0x7fc24d86f950]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0239)
> [0x7fc24d922239] ) 0-management: Lock for vol gluster_shared_storage not
> held
> [2019-01-15 15:18:36.842702] W [MSGID: 106117]
> [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
> released for gluster_shared_storage
> [2019-01-15 15:18:36.842719] W
> [glusterd-locks.c:806:glusterd_mgmt_v3_unlock]
> (-->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x24349)
> [0x7fc24d866349]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0x2d950)
> [0x7fc24d86f950]
> -->/usr/lib64/glusterfs/5.2/xlator/mgmt/glusterd.so(+0xe0074)
> [0x7fc24d922074] ) 0-management: Lock owner mismatch. Lock for vol
> integration-archive1 held by ffdaa400-82cc-4ada-8ea7-144bf3714269
> [2019-01-15 15:18:36.842727] W [MSGID: 106117]
> [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not
> released for integration-archive1
> [2019-01-15 15:18:36.842970] E [rpc-clnt.c:346:saved_frames_unwind] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x17d)[0x7fc253d7f18d] (-->
> /usr/lib64/libgfrpc.so.0(+0xca3d)[0x7fc253b46a3d] (-->
> /usr/lib64/libgfrpc.so.0(+0xcb5e)[0x7fc253b46b5e] (-->
> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x8b)[0x7fc253b480bb]
> (--> /usr/lib64/libgfrpc.so.0(+0xec68

Re: [Gluster-users] glusterfs 4.1.6 error in starting glusterd service

2019-01-16 Thread Amudhan P
I have created the folder in the path as suggested, but the service still
fails to start. Below is the error message from glusterd.log:

[2019-01-16 14:50:14.555742] I [MSGID: 100030] [glusterfsd.c:2741:main]
0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
[2019-01-16 14:50:14.559835] I [MSGID: 106478] [glusterd.c:1423:init]
0-management: Maximum allowed open file descriptors set to 65536
[2019-01-16 14:50:14.559894] I [MSGID: 106479] [glusterd.c:1481:init]
0-management: Using /var/lib/glusterd as working directory
[2019-01-16 14:50:14.559912] I [MSGID: 106479] [glusterd.c:1486:init]
0-management: Using /var/run/gluster as pid file working directory
[2019-01-16 14:50:14.563834] W [MSGID: 103071]
[rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
channel creation failed [No such device]
[2019-01-16 14:50:14.563867] W [MSGID: 103055] [rdma.c:4938:init]
0-rdma.management: Failed to initialize IB Device
[2019-01-16 14:50:14.563882] W [rpc-transport.c:351:rpc_transport_load]
0-rpc-transport: 'rdma' initialization failed
[2019-01-16 14:50:14.563957] W [rpcsvc.c:1781:rpcsvc_create_listener]
0-rpc-service: cannot create listener, initing the transport failed
[2019-01-16 14:50:14.563974] E [MSGID: 106244] [glusterd.c:1764:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2019-01-16 14:50:15.565868] I [MSGID: 106513]
[glusterd-store.c:2240:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 40100
[2019-01-16 14:50:15.642532] I [MSGID: 106544]
[glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
d6bf51a7-c296-492f-8dac-e81efa9dd22d
[2019-01-16 14:50:15.675333] I [MSGID: 106498]
[glusterd-handler.c:3614:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2019-01-16 14:50:15.675421] W [MSGID: 106061]
[glusterd-handler.c:3408:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2019-01-16 14:50:15.675451] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
*[2019-01-16 14:50:15.676912] E [MSGID: 106187]
[glusterd-store.c:4662:glusterd_resolve_all_bricks] 0-glusterd: resolve
brick failed in restore*
*[2019-01-16 14:50:15.676956] E [MSGID: 101019] [xlator.c:720:xlator_init]
0-management: Initialization of volume 'management' failed, review your
volfile again*
[2019-01-16 14:50:15.676973] E [MSGID: 101066]
[graph.c:367:glusterfs_graph_init] 0-management: initializing translator
failed
[2019-01-16 14:50:15.676986] E [MSGID: 101176]
[graph.c:738:glusterfs_graph_activate] 0-graph: init failed
[2019-01-16 14:50:15.677479] W [glusterfsd.c:1514:cleanup_and_exit]
(-->/usr/local/sbin/glusterd(glusterfs_volumes_init+0xc2) [0x409f52]
-->/usr/local/sbin/glusterd(glusterfs_process_volfp+0x151) [0x409e41]
-->/usr/local/sbin/glusterd(cleanup_and_exit+0x5f) [0x40942f] ) 0-:
received signum (-1), shutting down


On Thu, Jan 17, 2019 at 8:06 AM Atin Mukherjee  wrote:

> If gluster volume info/status shows the brick to be /media/disk4/brick4
> then you'd need to mount the same path and hence you'd need to create the
> brick4 directory explicitly. I fail to understand the rationale how only
> /media/disk4 can be used as the mount path for the brick.
>
> On Wed, Jan 16, 2019 at 5:24 PM Amudhan P  wrote:
>
>> Yes, I did mount bricks but the folder 'brick4' was still not created
>> inside the brick.
>> Do I need to create this folder because when I run replace-brick it will
>> create folder inside the brick. I have seen this behavior before when
>> running replace-brick or heal begins.
>>
>> On Wed, Jan 16, 2019 at 5:05 PM Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Wed, Jan 16, 2019 at 5:02 PM Amudhan P  wrote:
>>>
 Atin,
 I have copied the content of 'gfs-tst' from vol folder in another node.
 when starting service again fails with error msg in glusterd.log file.

 [2019-01-15 20:16:59.513023] I [MSGID: 100030] [glusterfsd.c:2741:main]
 0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd
 version 4.1.6 (args: /usr/local/sbin/glusterd -p /var/run/glusterd.pid)
 [2019-01-15 20:16:59.517164] I [MSGID: 106478] [glusterd.c:1423:init]
 0-management: Maximum allowed open file descriptors set to 65536
 [2019-01-15 20:16:59.517264] I [MSGID: 106479] [glusterd.c:1481:init]
 0-management: Using /var/lib/glusterd as working directory
 [2019-01-15 20:16:59.517283] I [MSGID: 106479] [glusterd.c:1486:init]
 0-management: Using /var/run/gluster as pid file working directory
 [2019-01-15 20:16:59.521508] W [MSGID: 103071]
 [rdma.c:4629:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
 channel creation failed [No such device]
 [2019-01-15 20:16:59.521544] W [MSGID: 103055] [rdma.c:4938:init]
 0-rdma.management: Failed to initialize IB Device
 [2019-01-15 20:16:59.521562] W [rpc-transport.c:351:rpc_transport_load]
 0-r