Re: [Gluster-users] Gluster Self Heal
On 09/07/13 18:17, 符永涛 wrote:
> Hi Toby,
> What's the bug #? I want to have a look and backport it to our
> production server if it helps. Thank you.

I think it was this one:
https://bugzilla.redhat.com/show_bug.cgi?id=947824

The bug being that the daemons were crashing out if you had a lot of
volumes defined, I think?

Toby
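For anyone wanting to confirm whether they are hitting the same crash, checking the self-heal daemon's process and log is usually enough. A minimal sketch follows; the log path is the usual default for distribution packages and may differ on your install:

    # Is the self-heal daemon still running on this node?
    # (it shows up as a glusterfs process with volfile-id gluster/glustershd)
    ps aux | grep '[g]lustershd'

    # Look for crash reports in its log (default path on most packages)
    grep -iE 'signal received|crash' /var/log/glusterfs/glustershd.log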
Re: [Gluster-users] Gluster Self Heal
On 07/10/2013 01:31 PM, Toby Corkindale wrote:
> On 09/07/13 18:17, 符永涛 wrote:
>> Hi Toby,
>> What's the bug #? I want to have a look and backport it to our
>> production server if it helps. Thank you.
>
> I think it was this one:
> https://bugzilla.redhat.com/show_bug.cgi?id=947824
>
> The bug being that the daemons were crashing out if you had a lot of
> volumes defined, I think?

A lot of volumes or a lot of delta to self-heal could trigger this crash.

3.3.2 containing this fix should be out real soon now. Appreciate your
patience in this regard.

Thanks,
Vijay
Re: [Gluster-users] Gluster Self Heal
On 10/07/2013 04:05 PM, Vijay Bellur wrote:
> A lot of volumes or a lot of delta to self-heal could trigger this crash.
>
> 3.3.2 containing this fix should be out real soon now. Appreciate your
> patience in this regard.

I hope this update will reach the Debian Wheezy repo.

Regards,
Harry
Re: [Gluster-users] Gluster Self Heal
On 09/07/13 15:38, Bobby Jacob wrote:
> Hi,
>
> I have a 2-node gluster with 3 TB storage.
>
> 1) I believe the “glusterfsd” is responsible for the self healing
> between the 2 nodes.
>
> 2) Due to some network error, the replication stopped for some reason,
> but the application was accessing the data from node1. When I manually
> try to start the “glusterfsd” service, it's not starting.
>
> Please advise on how I can maintain the integrity of the data so that
> we have all the data in both locations.

There were some bugs in the self-heal daemon present in 3.3.0 and 3.3.1.
Our systems see the SHD crash out with segfaults quite often, and it does
not recover.

I reported this bug a long time ago, and it was fixed in trunk relatively
quickly -- however version 3.3.2 has still not been released, despite the
fix being found six months ago. I find this quite disappointing.

T
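A quick way to see whether the self-heal daemon has died on one of the nodes is the volume status output, which in 3.3 lists the self-heal daemon alongside the brick processes. VOLNAME below is a placeholder for the actual volume name:

    # Each node should show a "Self-heal Daemon" entry with Online = Y
    gluster volume status VOLNAME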
Re: [Gluster-users] Gluster Self Heal
OK, so is there any workaround?

I have redeployed GlusterFS 3.3.1. I have kept it real simple:

Type: Replicate
Volume ID: 3e002989-6c9f-4f83-9bd5-c8a3442d8721
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: KWTTESTGSNODE002:/mnt/cloudbrick
Brick2: ZAJILTESTGSNODE001:/mnt/cloudbrick

Thanks & Regards,
Bobby Jacob
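As an interim workaround until 3.3.2 is out, the heal state of a replicate volume can be inspected and a heal triggered from the CLI (these heal commands exist from 3.3 onwards). VOLNAME below is a placeholder, since the volume name is not shown in the output above:

    # List entries that still need healing, and any split-brain entries
    gluster volume heal VOLNAME info
    gluster volume heal VOLNAME info split-brain

    # Kick off a heal of pending entries, or a full crawl of the bricks
    gluster volume heal VOLNAME
    gluster volume heal VOLNAME full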
Re: [Gluster-users] Gluster Self Heal
Hi Toby,

What's the bug #? I want to have a look and backport it to our production
server if it helps. Thank you.

2013/7/9 Toby Corkindale <toby.corkind...@strategicdata.com.au>:
> There were some bugs in the self-heal daemon present in 3.3.0 and 3.3.1.
> Our systems see the SHD crash out with segfaults quite often, and it
> does not recover.
>
> I reported this bug a long time ago, and it was fixed in trunk
> relatively quickly -- however version 3.3.2 has still not been released,
> despite the fix being found six months ago. I find this quite
> disappointing.

--
符永涛
Re: [Gluster-users] Gluster Self Heal
On 07/09/2013 11:08 AM, Bobby Jacob wrote:
> Hi,
>
> I have a 2-node gluster with 3 TB storage.
>
> 1) I believe the “glusterfsd” is responsible for the self healing
> between the 2 nodes.

glustershd, the self-heal daemon, is responsible for self healing between
the 2 nodes.

> 2) Due to some network error, the replication stopped for some reason,
> but the application was accessing the data from node1. When I manually
> try to start the “glusterfsd” service, it's not starting.

You can attempt "gluster volume start <volname> force" to spawn those
services which are offline.

> Please advise on how I can maintain the integrity of the data so that
> we have all the data in both locations.

If "gluster volume status" lists all your processes as online, you should
be doing fine.

-Vijay
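Putting Vijay's two suggestions together, a minimal sequence might look like this; VOLNAME is a placeholder for the actual volume name:

    # See which brick / NFS / self-heal daemon processes are offline
    gluster volume status VOLNAME

    # Respawn anything that is down without disturbing running processes
    gluster volume start VOLNAME force

    # Confirm everything now reports Online = Y
    gluster volume status VOLNAME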