[Gluster-users] Heal command stopped
I recently added a new replica server and now have: Number of Bricks: 1 x 2 = 2.

The heal was launched automatically and was working until yesterday (it copied 5.5 TB of a 6.2 TB total). Now the copy seems to have stopped; I no longer see any file changes on the new replica brick server. When I add a new file to the volume and check the physical files on the replica brick, the file is not there.

When I try to run a full heal with:

   sudo gluster volume heal storage full

I am getting:

   Launching heal operation to perform full self heal on volume storage has been unsuccessful on bricks that are down. Please check if all brick processes are running.

My volume info shows both bricks there. Any idea?

--
Kindest regards,

Milos Cuculovic
IT Manager

MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculo...@mdpi.com
Skype: milos.cuculovic.mdpi

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
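When `heal ... full` complains about bricks being down, the usual first step is to check the brick processes and the heal backlog. A minimal sketch, using the volume name `storage` from this thread; the `run` wrapper only prints each command so the plan can be reviewed before executing anything for real (replace its body with `"$@"` to execute):

```shell
#!/bin/sh
# Dry-run sketch: print the diagnostic commands instead of executing them.
run() { printf '+ %s\n' "$*"; }

# 1. Confirm every brick process and the self-heal daemon are online.
run gluster volume status storage

# 2. Check peer connectivity from both servers.
run gluster peer status

# 3. Look at the pending-heal backlog before forcing a full heal.
run gluster volume heal storage info

# 4. Only then retry the full heal.
run gluster volume heal storage full
```

If step 1 shows any brick with Online = N, the full heal will keep failing until that brick process is back up.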
Re: [Gluster-users] Fwd: Replica brick not working
All is fixed now, thank you very much. I had to completely purge all glusterfs files from the storage2 server. The entire procedure for adding a new replica brick is as follows:

1. Purge all glusterfs packages on the new server.
2. Remove everything under /var/log/glusterfs and /var/lib/glusterd.
3. Install a fresh glusterfs-server installation.
4. On the new server, start glusterfs-server.
5. On the old server, run: gluster peer probe storage2
6. On both servers, check: gluster peer status
7. If both report 'Peer in Cluster', you're good.
8. On the old server, run: sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster
9. The return message should be: volume add-brick: success
10. Healing starts automatically; check its status with: gluster volume heal storage info

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 08.12.2016 18:58, Pranith Kumar Karampuri wrote:
On Thu, Dec 8, 2016 at 10:22 PM, Ravishankar N wrote:
On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
I was able to fix the sync by rsync-ing all the directories, then the heal started. The next problem :) : as soon as there are files on the new brick, the gluster mount serves reads from it too, but the new brick is not ready yet, as the sync is not yet done, so clients see missing files. I temporarily removed the new brick; now I am running a manual rsync and will add the brick again, hoping this works. What mechanism manages this? I assume there is something pre-built to make a replica brick available only once the data is completely synced.
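The ten steps at the top of this message can be collapsed into a short script. This is a hedged sketch, not an official procedure: the hostnames (storage, storage2), the brick path /data/data-cluster, and the apt-based package manager are assumptions taken from this thread, and the `run` wrapper only prints each command so nothing is executed blindly:

```shell
#!/bin/sh
# Sketch of the add-replica procedure from the thread (dry run).
run() { printf '+ %s\n' "$*"; }

# Steps 1-4: clean slate on the NEW server (storage2).
run apt-get purge glusterfs-server glusterfs-common   # assumes a Debian/Ubuntu host
run rm -rf /var/log/glusterfs /var/lib/glusterd       # wipe logs and old glusterd state
run apt-get install glusterfs-server
run systemctl start glusterfs-server

# Steps 5-7: peer from the OLD server and verify.
run gluster peer probe storage2
run gluster peer status        # must show 'Peer in Cluster' on both sides

# Steps 8-10: add the brick and watch the automatic heal.
run gluster volume add-brick storage replica 2 storage2:/data/data-cluster
run gluster volume heal storage info
```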
This mechanism was introduced in 3.7.9 or 3.7.10 (http://review.gluster.org/#/c/13806/). Before that version, you needed to manually set some xattrs on the bricks so that healing could happen in parallel while the client would still serve reads from the original brick. I can't find the link to the doc which describes these steps for setting the xattrs. :-(
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick

Oh, this is addition of bricks? Just do the following:

1) Bring the new brick down by killing it.
2) On the root of the mount directory (let's call it /mnt) do:
   mkdir /mnt/
   rmdir /mnt/
   setfattr -n trusted.non-existent-key -v abc /mnt
   setfattr -x trusted.non-existent-key /mnt
3) Start the volume using: gluster volume start <volname> force

This will trigger the heal, which will make sure everything is healed, and the application will only see correct data. Since you did an explicit rsync, there is no guarantee that things will work as expected. We will be adding the steps above to the documentation. Please note that you need to follow these steps exactly: if you do the mkdir/rmdir/setfattr steps with the good brick brought down instead, a reverse heal will happen and the data will be removed.

Calling it a day,
Ravi

On 08.12.2016 16:17, Ravishankar N wrote:
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI wrote:
Ah, damn! I found the issue.
On the storage server, the storage2 IP address was wrong; I had swapped two digits in the /etc/hosts file.
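Ravi's xattr trick above (touching metadata on the mount root so the good brick records pending heals for the empty replica) can be written down as a script. A hedged sketch: MOUNT and VOLUME are placeholders for this thread's setup, the temporary directory name is arbitrary, and the `run` wrapper prints instead of executing, since running the setfattr steps against the wrong brick causes exactly the reverse heal Ravi warns about:

```shell
#!/bin/sh
# Sketch of the pre-3.7.9 manual heal trigger (dry run: commands are printed).
run() { printf '+ %s\n' "$*"; }

MOUNT=/mnt           # client mount point of the volume (assumption)
VOLUME=storage       # volume name taken from the thread

# 1) Kill the brick process of the NEW (empty) brick -- never the good one.
run kill "<pid-of-new-brick-process>"

# 2) Touch metadata on the mount root so the good brick marks pending heals.
#    The directory name below is a hypothetical placeholder.
run mkdir "$MOUNT/heal-trigger-dir"
run rmdir "$MOUNT/heal-trigger-dir"
run setfattr -n trusted.non-existent-key -v abc "$MOUNT"
run setfattr -x trusted.non-existent-key "$MOUNT"

# 3) Restart the volume; self-heal then copies everything to the new brick.
run gluster volume start "$VOLUME" force
```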
Re: [Gluster-users] Fwd: Replica brick not working
Atin, I was able to move forward a bit. Initially, I had this:

   sudo gluster peer status
   Number of Peers: 1
   Hostname: storage2
   Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
   State: Peer Rejected (Connected)

Then, on storage2, I removed everything from /var/lib/glusterd except the info file. Now I am getting another message:

   sudo gluster peer status
   Number of Peers: 1
   Hostname: storage2
   Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
   State: Sent and Received peer request (Connected)

But the add-brick is still not working. I checked the hosts file and all seems OK; ping works as well.

The thing I also need to know: when adding a new replicated brick, do I need to first sync all files, or does the new brick server need to be empty? Also, do I first need to create the same volume on the new server, or will adding it to the volume of server1 do that automatically?

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 14.12.2016 05:13, Atin Mukherjee wrote:
Milos, I just managed to take a look into a similar issue and my analysis is at [1]. I remember you mentioning some incorrect /etc/hosts entries which led to this same problem in an earlier case; do you mind rechecking that?
[1] http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html

On Wed, Dec 14, 2016 at 2:57 AM, Miloš Čučulović - MDPI wrote:
Hi All, moving forward with my issue, sorry for the late reply! I had some issues with the storage2 server (original volume), then decided to use 3.9.0, so I have the latest version. For that, I synced all the files manually to the storage server. I installed gluster 3.9.0 there, started it, created a new volume called storage, and all seems to work OK. Now I need to create my replicated volume (add a new brick on the storage2 server).
Almost all the files are there. So, on the storage server I was running:

* sudo gluster peer probe storage2
* sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force

But there I am receiving: "volume add-brick: failed: Host storage2 is not in 'Peer in Cluster' state"

Any idea?

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG
Re: [Gluster-users] Fwd: Replica brick not working
Hi All,

Moving forward with my issue, sorry for the late reply! I had some issues with the storage2 server (original volume), then decided to use 3.9.0, so I have the latest version. For that, I synced all the files manually to the storage server. I installed gluster 3.9.0 there, started it, created a new volume called storage, and all seems to work OK.

Now I need to create my replicated volume (add a new brick on the storage2 server). Almost all the files are there. So, on the storage server I was running:

* sudo gluster peer probe storage2
* sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force

But there I am receiving: "volume add-brick: failed: Host storage2 is not in 'Peer in Cluster' state"

Any idea?

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 08.12.2016 17:52, Ravishankar N wrote:
On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
I was able to fix the sync by rsync-ing all the directories, then the heal started.

This mechanism was introduced in 3.7.9 or 3.7.10 (http://review.gluster.org/#/c/13806/). Before that version, you needed to manually set some xattrs on the bricks so that healing could happen in parallel while the client would still serve reads from the original brick.
I can't find the link to the doc which describes these steps for setting the xattrs. :-(

Calling it a day,
Ravi

On 08.12.2016 16:17, Ravishankar N wrote:
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI wrote:
Ah, damn! I found the issue. On the storage server, the storage2 IP address was wrong; I had swapped two digits in the /etc/hosts file, sorry for that :( I was able to add the brick now and started the heal, but still no data transfer is visible.

1. Are the files getting created on the new brick, though?
2. Can you provide the output of `getfattr -d -m . -e hex /data/data-cluster` on both bricks?
3. Is it possible to attach gdb to the self-heal daemon on the original (old) brick and get a backtrace?
   gdb -p <pid>
   thread apply all bt   --> share this output
   quit gdb.
-Ravi

@Ravi/Pranith - can you help here?

By doing gluster volume status, I have:

Status of volume: storage
Gluster process                        TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------
Brick storage2:/data/data-cluster      49152     0          Y       23101
Brick storage:/data/data-cluster       49152     0          Y       30773
Self-heal Daemon on localhost          N/A       N/A        Y       30050
Self-heal Daemon on storage            N/A       N/A        Y       30792

Any idea?

On storage I have:

Number of Peers: 1
Hostname: 195.65.194.217
Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
State: Peer in Cluster (Connected)

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 08.12.2016 13:55, Atin Mukherjee wrote:
Can you resend the attachment as a zip? I am unable to extract the content. We shouldn't have a zero-length info file. What does the gluster peer status output say?

On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI wrote:
I hope you received my last email, Atin, thank you!
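Ravi's checks in the quoted message (xattr dump on the brick roots, backtrace of the self-heal daemon) can be scripted. A hedged sketch: the brick path is taken from this thread, the pid is a placeholder, and the `run` wrapper prints instead of executing, since attaching gdb to a live daemon should be a deliberate step:

```shell
#!/bin/sh
# Dry-run sketch of the AFR debugging checks (prints commands only).
run() { printf '+ %s\n' "$*"; }

BRICK=/data/data-cluster   # brick root from the thread

# Dump the trusted.* xattrs AFR uses to track pending heals;
# compare the output between the old and the new brick.
run getfattr -d -m . -e hex "$BRICK"

# Grab a backtrace of the self-heal daemon (glustershd); its pid is
# visible in 'gluster volume status'. -batch/-ex make gdb non-interactive.
run gdb -p "<glustershd-pid>" -batch -ex "thread apply all bt"
```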
Re: [Gluster-users] Fwd: Replica brick not working
I was able to fix the sync by rsync-ing all the directories, then the heal started.

The next problem :) : as soon as there are files on the new brick, the gluster mount serves reads from it too, but the new brick is not ready yet, as the sync is not yet done, so clients see missing files. I temporarily removed the new brick; now I am running a manual rsync and will add the brick again, hoping this works. What mechanism manages this? I assume there is something pre-built to make a replica brick available only once the data is completely synced.

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 08.12.2016 16:17, Ravishankar N wrote:
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI wrote:
Ah, damn! I found the issue. On the storage server, the storage2 IP address was wrong; I had swapped two digits in the /etc/hosts file, sorry for that :( I was able to add the brick now and started the heal, but still no data transfer is visible.

1. Are the files getting created on the new brick, though?
2. Can you provide the output of `getfattr -d -m . -e hex /data/data-cluster` on both bricks?
3. Is it possible to attach gdb to the self-heal daemon on the original (old) brick and get a backtrace?
-Ravi

On 08.12.2016 13:55, Atin Mukherjee wrote:
Can you resend the attachment as a zip? I am unable to extract the content. We shouldn't have a zero-length info file. What does the gluster peer status output say?

On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI wrote:
I hope you received my last email, Atin, thank you!
Re: [Gluster-users] Fwd: Replica brick not working
Additional info: there are warnings/errors on the new brick:

[2016-12-08 15:37:05.053615] E [MSGID: 115056] [server-rpc-fops.c:509:server_mkdir_cbk] 0-storage-server: 12636867: MKDIR /dms (00000000-0000-0000-0000-000000000001/dms) ==> (Permission denied) [Permission denied]
[2016-12-08 15:37:05.135607] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 12636895: FSTAT -2 (e9481d78-9094-45a7-ac7e-e1feeb7055df) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.163610] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523605: FSTAT -2 (2bb87992-5f24-44bd-ba7c-70c84510942b) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.163633] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523604: FSTAT -2 (2bb87992-5f24-44bd-ba7c-70c84510942b) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.166590] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523619: FSTAT -2 (616028b7-a2c2-40e3-998a-68329daf7b07) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.166659] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523620: FSTAT -2 (616028b7-a2c2-40e3-998a-68329daf7b07) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.241276] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3451382: FSTAT -2 (f00e597e-7ae4-4d3a-986e-bbeb6cc07339) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.268583] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523823: FSTAT -2 (a8a343c1-512f-4ad1-a3db-de9fc8ed990c) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.268771] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523824: FSTAT -2 (a8a343c1-512f-4ad1-a3db-de9fc8ed990c) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.302501] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523868: FSTAT -2 (eb0c4500-f9ae-408a-85e6-6e67ec6466a9) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.302558] I [MSGID: 115081] [server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523869: FSTAT -2 (eb0c4500-f9ae-408a-85e6-6e67ec6466a9) ==> (No such file or directory) [No such file or directory]
[2016-12-08 15:37:05.365428] E [MSGID: 115056] [server-rpc-fops.c:509:server_mkdir_cbk] 0-storage-server: 12637038: MKDIR /files (00000000-0000-0000-0000-000000000001/files) ==> (Permission denied) [Permission denied]
[2016-12-08 15:37:05.414486] E [MSGID: 115056] [server-rpc-fops.c:509:server_mkdir_cbk] 0-storage-server: 3451430: MKDIR /files (00000000-0000-0000-0000-000000000001/files) ==> (Permission denied) [Permission denied]

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 08.12.2016 16:32, Miloš Čučulović - MDPI wrote:
1. No, atm the old server's (storage2) volume is mounted on some other servers, so all files are created there. If I check the new brick, there are no files.
2. The getfattr output is identical on storage2 (old brick) and storage (new brick):

getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0001
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382
Re: [Gluster-users] Fwd: Replica brick not working
1. No, atm the old server's (storage2) volume is mounted on some other servers, so all files are created there. If I check the new brick, there are no files.

2. On storage2 server (old brick):

getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0001
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382

On storage server (new brick):

getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x0001
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382

3.

Thread 8 (Thread 0x7fad832dd700 (LWP 30057)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x7fad88834f3e in __afr_shd_healer_wait () from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#2  0x7fad88834fad in afr_shd_healer_wait () from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#3  0x7fad88835aa0 in afr_shd_index_healer () from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#4  0x7fad8df4270a in start_thread (arg=0x7fad832dd700) at pthread_create.c:333
#5  0x7fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 7 (Thread 0x7fad83ade700 (LWP 30056)):
#0  0x7fad8dc78e23 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#1  0x7fad8e808a58 in ?? () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x7fad8df4270a in start_thread (arg=0x7fad83ade700) at pthread_create.c:333
#3  0x7fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 6 (Thread 0x7fad894a5700 (LWP 30055)):
#0  0x7fad8dc78e23 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#1  0x7fad8e808a58 in ?? () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x7fad8df4270a in start_thread (arg=0x7fad894a5700) at pthread_create.c:333
#3  0x7fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 5 (Thread 0x7fad8a342700 (LWP 30054)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x7fad8e7ecd98 in syncenv_task () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x7fad8e7ed970 in syncenv_processor () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3  0x7fad8df4270a in start_thread (arg=0x7fad8a342700) at pthread_create.c:333
#4  0x7fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 4 (Thread 0x7fad8ab43700 (LWP 30053)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x7fad8e7ecd98 in syncenv_task () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x7fad8e7ed970 in syncenv_processor () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3  0x7fad8df4270a in start_thread (arg=0x7fad8ab43700) at pthread_create.c:333
#4  0x7fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x7fad8b344700 (LWP 30052)):
#0  do_sigwait (sig=0x7fad8b343e3c, set=<optimized out>) at ../sysdeps/unix/sysv/linux/sigwait.c:64
#1  __sigwait (set=<optimized out>, sig=0x7fad8b343e3c) at ../sysdeps/unix/sysv/linux/sigwait.c:96
#2  0x004080bf in glusterfs_sigwaiter ()
#3  0x7fad8df4270a in start_thread (arg=0x7fad8b344700) at pthread_create.c:333
#4  0x7fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7fad8bb45700 (LWP 30051)):
#0  0x7fad8df4bc6d in nanosleep () at ../sysdeps/unix/syscall-template.S:84
#1  0x7fad8e7ca744 in gf_timer_proc () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x7fad8df4270a in start_thread (arg=0x7fad8bb45700) at pthread_create.c:333
#3  0x7fad8dc7882d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7fad8ec66780 (LWP 30050)):
#0  0x7fad8df439dd in pthread_join (threadid=140383309420288, thread_return=0x0) at pthread_join.c:90
#1  0x7fad8e808eeb in ?? () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00405501 in main ()

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 08.12.2016 16:17, Ravishankar N wrote:
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI wrote:
Ah, damn! I found the issue. On the storage server, the storage2 IP address was wrong; I had swapped two digits in the /etc/hosts file, sorry for that :( I was able to add the brick now, I s
Re: [Gluster-users] Fwd: Replica brick not working
A note to add: when checking sudo gluster volume heal storage info, I am getting more than 2k entries ("Number of entries"), but no files on the storage server (the replica).

--
Kindest regards,
Milos Cuculovic, IT Manager, MDPI AG

On 08.12.2016 14:23, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI wrote:
Ah, damn! I found the issue. On the storage server, the storage2 IP address was wrong; I had swapped two digits in the /etc/hosts file, sorry for that :( I was able to add the brick now and started the heal, but still no data transfer is visible.

@Ravi/Pranith - can you help here?

By doing gluster volume status, I have:

Status of volume: storage
Gluster process                        TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------
Brick storage2:/data/data-cluster      49152     0          Y       23101
Brick storage:/data/data-cluster       49152     0          Y       30773
Self-heal Daemon on localhost          N/A       N/A        Y       30050
Self-heal Daemon on storage            N/A       N/A        Y       30792

Any idea?

On storage I have:

Number of Peers: 1
Hostname: 195.65.194.217
Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
State: Peer in Cluster (Connected)

On 08.12.2016 13:55, Atin Mukherjee wrote:
Can you resend the attachment as a zip? I am unable to extract the content. We shouldn't have a zero-length info file. What does the gluster peer status output say?

On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI wrote:
I hope you received my last email, Atin, thank you!

On 08.12.2016 10:28, Atin Mukherjee wrote:
---------- Forwarded message ----------
From: Atin Mukherjee <amukh...@redhat.com>
Date: Thu, Dec 8, 2016 at 11:56 AM
Subject: Re: [Gluster-users] Replica brick not working
To: Ravishankar N <ravishan...@redhat.com>
Cc: Miloš Čučulović - MDPI <cuculo...@mdpi.com>, Pranith Kumar Karampuri <pkara...@redhat.com>, gluster-users <gluster-users@gluster.org>
Re: [Gluster-users] Replica brick not working
Please find attached the required files from storage2 server. The storage server has no /var/lib/glusterd/vols// files. - Kindest regards, Milos Cuculovic IT Manager --- MDPI AG Postfach, CH-4020 Basel, Switzerland Office: St. Alban-Anlage 66, 4052 Basel, Switzerland Tel. +41 61 683 77 35 Fax +41 61 302 89 18 Email: cuculo...@mdpi.com Skype: milos.cuculovic.mdpi On 08.12.2016 07:26, Atin Mukherjee wrote: On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N mailto:ravishan...@redhat.com>> wrote: On 12/08/2016 10:43 AM, Atin Mukherjee wrote: >From the log snippet: [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2 [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management: The last log entry indicates that we hit the code path in gd_addbr_validate_replica_count () if (replica_count == volinfo->replica_count) { if (!(total_bricks % volinfo->dist_leaf_count)) { ret = 1; goto out; } } It seems unlikely that this snippet was hit because we print the E [MSGID: 106291] in the above message only if ret==-1. gd_addbr_validate_replica_count() returns -1 and yet not populates err_str only when in volinfo->type doesn't match any of the known volume types, so volinfo->type is corrupted perhaps? You are right, I missed that ret is set to 1 here in the above snippet. @Milos - Can you please provide us the volume info file from /var/lib/glusterd/vols// from all the three nodes to continue the analysis? -Ravi @Pranith, Ravi - Milos was trying to convert a dist (1 X 1) volume to a replicate (1 X 2) using add brick and hit this issue where add-brick failed. The cluster is operating with 3.7.6. Could you help on what scenario this code path can be hit? 
One straightforward issue I see here is the missing err_str in this path.

--
~ Atin (atinm)

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
[Gluster-users] Replica brick not working
Dear All,

I have two servers, storage and storage2. The storage2 server had a volume called storage. I then decided to add a replica brick (storage). I did this in the following way:

1. sudo gluster peer probe storage (on the storage2 server)
2. sudo gluster volume add-brick storage replica 2 storage:/data/data-cluster

Then I was getting the following error:

volume add-brick: failed: Operation failed

But it seems the brick was somehow added, as when checking on storage2 with sudo gluster volume info storage I am getting:

Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: storage2:/data/data-cluster
Brick2: storage:/data/data-cluster

So it seems ok here. However, when doing sudo gluster volume heal storage info I am getting:

Volume storage is not of type replicate/disperse
Volume heal failed.

Also, when doing sudo gluster volume status all I am getting:

Status of volume: storage
Gluster process                          TCP Port  RDMA Port  Online  Pid
--------------------------------------------------------------------------
Brick storage2:/data/data-cluster        49152     0          Y       2160
Brick storage:/data/data-cluster         N/A       N/A        N       N/A
Self-heal Daemon on localhost            N/A       N/A        Y       7906
Self-heal Daemon on storage              N/A       N/A        N       N/A

Task Status of Volume storage
--------------------------------------------------------------------------

Any idea please?

--
- Kindest regards, Milos Cuculovic, IT Manager
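To spot at a glance which processes a status listing flags as down, one can filter on the Online column. The sketch below runs against the exact output quoted above; the assumption that the Online flag is the second-to-last whitespace-separated field is mine, based on the plain-text layout.

```shell
# Hedged sketch: filter `gluster volume status` lines whose Online column is "N".
# The sample data is the status output quoted in this message.
status='Brick storage2:/data/data-cluster 49152 0 Y 2160
Brick storage:/data/data-cluster N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 7906
Self-heal Daemon on storage N/A N/A N N/A'

# The Online flag is the second-to-last field on each process line.
offline=$(printf '%s\n' "$status" | awk '$(NF-1) == "N"')
printf '%s\n' "$offline"
```

Here both the brick process on storage and the self-heal daemon on storage are flagged offline, which matches the failed add-brick described above.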
Re: [Gluster-users] Add new server to glusterFS (replica)
Thanks a lot! I will try this asap, but before that I have an important question: is there any downtime on server1 while installing the replica on server2? Our server1 is a production server, so I cannot allow downtime.

- Kindest regards, Milos Cuculovic, IT Manager

On 02.12.2016 18:17, Page, Garrett S wrote:

These are the correct steps. I'll add that we did some experiments, and pre-staging the data on the new brick did seem to reduce the rebuild time.

-Original Message-
From: gluster-users-boun...@gluster.org [mailto:gluster-users-boun...@gluster.org] On Behalf Of Nag Pavan Chilakam
Sent: Friday, December 02, 2016 8:42 AM
To: Miloš Čučulović - MDPI
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Add new server to glusterFS (replica)

Hi Milos,
You need to follow the below steps:
1) Once you identify server2, install the same version of the gluster packages as on server1.
2) Peer probe server2 from server1.
3) Do a "gluster volume add-brick <volname> replica 2 <server2>:<brick-path>".
The syncing of files in your gluster storage will happen automatically when the self-heal daemon picks up the job. If you don't want to wait, issue a "gluster volume heal <volname>".

thanks,
nag pavan

- Original Message -
From: "Miloš Čučulović - MDPI"
To: gluster-users@gluster.org
Sent: Friday, 2 December, 2016 9:38:55 PM
Subject: [Gluster-users] Add new server to glusterFS (replica)

Hi All,
I have a running GlusterFS volume with one brick, the actual server1. I purchased a 2nd server and would like to add it as a replica. Could someone propose a tutorial? Should I first sync the files between server1 and server2? Thank you!

--
- Kindest regards, Milos Cuculovic, IT Manager
MDPI AG, Postfach, CH-4020 Basel, Switzerland; Office: St. Alban-Anlage 66, 4052 Basel, Switzerland; Tel.
+41 61 683 77 35; Fax +41 61 302 89 18; Email: cuculo...@mdpi.com; Skype: milos.cuculovic.mdpi

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
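Put together, the steps from this exchange amount to the short sequence below. This is a hedged sketch, not the thread's literal commands: the hostname server2, the volume name storage, and the brick path /data/data-cluster are substituted from elsewhere in the thread, and each command is echoed rather than executed so the order can be reviewed first.

```shell
# Hedged sketch of the add-replica sequence; `run` only echoes each command.
# Replace `echo` with direct execution once the names match your setup.
run() { echo "+ $*"; }

run gluster peer probe server2                                             # step 2
run gluster volume add-brick storage replica 2 server2:/data/data-cluster  # step 3
run gluster volume heal storage full    # optional: kick off healing immediately
run gluster volume heal storage info    # monitor self-heal progress
```

Running the commands for real requires working name resolution for server2 on server1 and an empty, mounted brick directory on server2.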
[Gluster-users] Add new server to glusterFS (replica)
Hi All,

I have a running GlusterFS volume with one brick, the actual server1. I purchased a 2nd server and would like to add it as a replica. Could someone propose a tutorial? Should I first sync the files between server1 and server2?

Thank you!

--
- Kindest regards, Milos Cuculovic, IT Manager