[Gluster-users] Big problem
Hi,

I had a 4 x 3 gluster volume distributed over 6 servers (2 bricks from each) and wanted to move to a 4 x 2 volume by removing two nodes. The initial config is here: http://fpaste.org/txxs/

I asked on the gluster IRC for the command to do this and was given:

  gluster volume remove-brick home0 replica 2 192.168.1.243:/d35 192.168.1.240:/d35 192.168.1.243:/d36 192.168.1.240:/d36

Having read the gluster help output, I ascertained I should probably append "start" to have it gracefully check everything (without it, gluster did warn of possible data loss). However, the result was that it started rebalancing and immediately reconfigured the volume into 6 x 2 replica sets, so now I have a HUGE mess: http://fpaste.org/EpKG/

Most processes failed and directory listings come up doubled:

  [root@wn-c-27 test]# ls
  ls: cannot access hadoop-fuse-addon.tgz: No such file or directory
  ls: cannot access hadoop-fuse-addon.tgz: No such file or directory
  etc  hadoop-fuse-addon.tgz  hadoop-fuse-addon.tgz
  [root@wn-c-27 test]#

I urgently need help recovering from this state. As soon as I noticed what was happening I halted the remove-brick with the stop command, but the mess remains as it is. Should I force the remove-brick? Should I stop the volume and glusterd and manually reconfigure it back to 4 x 3, or how else can I recover to a consistent filesystem? This is users' /home, so a huge mess is NOT a good thing, and because of the 3x replication there is no separate backup right now either...

Mario Kadastik, PhD
Researcher

---
Physics is like sex, sure it may have practical reasons, but that's not why we do it
-- Richard P. Feynman

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
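[For reference, the staged removal flow the "start" keyword enables is the start/status/commit sequence. This is a sketch against the gluster 3.3 CLI using the brick list from the message, not a verified recovery recipe; whether combining it with a replica-count change migrates data first or immediately re-pairs the replica sets is exactly the behaviour in question here.]

```shell
# Begin a staged removal: data should be migrated off the listed
# bricks before they are dropped from the volume.
gluster volume remove-brick home0 replica 2 \
    192.168.1.243:/d35 192.168.1.240:/d35 \
    192.168.1.243:/d36 192.168.1.240:/d36 start

# Poll until every listed brick reports the migration as completed.
gluster volume remove-brick home0 replica 2 \
    192.168.1.243:/d35 192.168.1.240:/d35 \
    192.168.1.243:/d36 192.168.1.240:/d36 status

# Only once status shows completion, finalize the removal.
gluster volume remove-brick home0 replica 2 \
    192.168.1.243:/d35 192.168.1.240:/d35 \
    192.168.1.243:/d36 192.168.1.240:/d36 commit
```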
Re: [Gluster-users] glusterfs performance issues
You have a replicated filesystem with brick1 and brick2. Brick2 goes down and you edit a 4k file, appending data to it. That change, and the fact that there is a pending change, is recorded on brick1. Brick2 returns to service. Your app wants to append to the file again, so it calls stat on the file. Brick2 answers first, stating that the file is 4k long. Your app seeks to 4k and writes, and now the data you appended earlier is gone. This is one of the ways stale stat data can cause data loss. That's why each lookup() (which precedes the stat) triggers a self-heal check, and why it's a problem that hasn't been resolved in the last two years. I don't know the answer. I know that they want this problem solved, but right now the best mitigation is hardware: the lower the latency, the less of a problem you'll have.

Well, I'd assume that a brick coming back online has to check everything against the other online bricks before it is authoritative in answering any client calls. If a brick comes up and sees other bricks in its replica set already online, the assumption should be that the data on this brick may be bad; until a complete self-heal is performed, the brick should be considered non-authoritative for that information. The next step from this is how to guarantee that a brick actually returns to a healthy state in a busy filesystem. The basic way would be that any new writes are applied to all bricks (including the bad one) and declared good on the healing brick, while a background process hashes all files on the brick and checks them against the bricks that were live before.
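The stale-stat race described above can be made concrete with a toy model (hypothetical names; each "brick" is just a dict mapping path to bytes, and the "stat" is the length reported by the not-yet-healed replica):

```python
def append_via_stale_stat(good_brick, stale_brick, path, data):
    # The app stats the file; the stale replica answers first,
    # reporting the file size from before its outage.
    stale_size = len(stale_brick[path])
    # The app seeks to that size and writes on the authoritative
    # copy, clobbering data written while the stale brick was down.
    buf = bytearray(good_brick[path])
    buf[stale_size:stale_size + len(data)] = data
    good_brick[path] = bytes(buf)

brick1 = {"f": b"AAAA" + b"NEW!"}   # got an append while brick2 was down
brick2 = {"f": b"AAAA"}             # rejoined, not yet healed
append_via_stale_stat(brick1, brick2, "f", b"ZZ")
# The earlier append "NEW!" is partially overwritten: b"AAAAZZW!"
```

The write itself succeeds on the healthy brick; the loss comes purely from trusting the stale size, which is why the lookup-triggered self-heal check exists.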
In a reasonable environment this should complete in a reasonable amount of time, and at worst means running at reduced performance while the sync happens. But it would guarantee no data loss unless you lose all of your previously online bricks, in which case you're in disaster recovery anyway, and this semi-live brick can still help recover files from the time it last went down or better.

Mario Kadastik, PhD
Researcher
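The healing scheme sketched in these two paragraphs can be modelled in a few lines (a sketch with hypothetical names, not gluster's actual self-heal implementation): new writes go to every replica including the healing one, while a background pass re-hashes existing files on the healing brick against a brick that stayed online and copies over anything that differs.

```python
import hashlib

def write_all(replicas, path, data):
    # New writes are applied to every replica, healing or not,
    # so freshly written files are immediately good everywhere.
    for brick in replicas:
        brick[path] = data

def background_heal(healing, good):
    # Hash-compare every file on the reference brick against the
    # healing brick; copy anything stale or missing.
    healed = []
    for path, data in good.items():
        if hashlib.sha256(healing.get(path, b"")).digest() != \
           hashlib.sha256(data).digest():
            healing[path] = data
            healed.append(path)
    return healed

good = {"a": b"v2", "b": b"x"}      # brick that stayed online
healing = {"a": b"v1"}              # rejoined with stale "a", missing "b"
write_all([good, healing], "c", b"new")   # new write hits both bricks
assert background_heal(healing, good) == ["a", "b"]
assert healing == good
```

Note the trade-off the paragraph mentions: the hashing pass costs I/O on the live bricks, which is the "reduced performance while this sync is happening".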
Re: [Gluster-users] how well will this work
I'm going to make this as simple as possible. Every message to this list should follow these rules:

1. be helpful
2. be constructive
3. be respectful

I will not tolerate ranting that serves no purpose. If your message doesn't follow the rules above, then you shouldn't be posting it.

I might be jumping in here at a random spot, but looking at Stephan's e-mail, it was all three. It was helpful and constructive in outlining a concrete strategy that, in his opinion, would make glusterfs greater, and to an extent that's something I share: performance IS an issue and makes me hesitate to move glusterfs to the next level at our site (right now we have a 6-node, 12-brick configuration that's used extensively as /home; the target would be a 180-node, 2PB distributed 2-way replicated installation). We hit FUSE snags from day 2 and are running on NFS right now because negative-lookup caching is not in FUSE; in fact there is no caching at all. And NFS has hiccups of its own that cause issues, especially for us because we use vz containers with bind mounting, so if the head node's NFS mount goes stale we have to hack a lot to get the stale mount remounted in all the VZ images. I've had at least two or three instances where I had to stop all the containers, killing user tasks, to remount stably.

And to be fair, at least in this particular e-mail I didn't really see much disrespect, just some comparisons that I think still remained within a respectful range.

Mario Kadastik, PhD
Researcher
[Gluster-users] stuck lock
-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: f1a89ed2-a2f5-49a9-9482-1c6984c37945
[2012-12-13 15:09:33.566024] I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: b1ce84be-de0b-4ae1-a1e8-758d828b8872
[2012-12-13 15:09:33.566047] I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: 0f61d484-0f93-4144-b166-2145f4ea4427
[2012-12-13 15:09:33.566069] I [glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC from uuid: d9b48655-4b25-4ad2-be19-c5ec8768a789
[2012-12-13 15:09:33.566224] I [glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 5 peers
[2012-12-13 15:09:33.566420] I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: b1ce84be-de0b-4ae1-a1e8-758d828b8872
[2012-12-13 15:09:33.566450] I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: d9b48655-4b25-4ad2-be19-c5ec8768a789
[2012-12-13 15:09:33.566499] I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: f1a89ed2-a2f5-49a9-9482-1c6984c37945
[2012-12-13 15:09:33.566524] I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 0f61d484-0f93-4144-b166-2145f4ea4427
[2012-12-13 15:09:33.57] I [glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from uuid: 663ecbfb-4209-417e-a955-6c9f72751dbc

hangs here, ctrl+C

  [root@se1 home0]# gluster volume heal home0
  operation failed
  [root@se1 home0]#

== cli.log ==
[2012-12-13 15:10:00.686308] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to socket
[2012-12-13 15:10:00.842108] I [cli-rpc-ops.c:5928:gf_cli3_1_heal_volume_cbk] 0-cli: Received resp to heal volume
[2012-12-13 15:10:00.842187] I [input.c:46:cli_batch] 0-: Exiting with: -1

== etc-glusterfs-glusterd.vol.log ==
[2012-12-13 15:10:00.841789] I [glusterd-volume-ops.c:492:glusterd_handle_cli_heal_volume] 0-management: Received heal vol req for volume home0
[2012-12-13 15:10:00.841910] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: c3ce6b9c-6297-4e77-924c-b44e2c13e58f, lock held by: c3ce6b9c-6297-4e77-924c-b44e2c13e58f
[2012-12-13 15:10:00.841926] E [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1

Mario Kadastik, PhD
Researcher
Re: [Gluster-users] stuck lock
Hi all, two updates first. All gluster nodes are currently Scientific Linux 5.7 (Linux se1 2.6.18-308.16.1.el5 #1 SMP Wed Oct 3 00:53:20 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux) with gluster version 3.3.1. The client nodes that mount the volume over NFS are CentOS 6.3.

Also, after everything else failed, I unmounted the volume everywhere (sometimes by force), stopped glusterd and glusterfsd, and after restarting them the issue had disappeared. However, this is NOT a way I'd like to fix things, as it was very, very disruptive.

Mario Kadastik, PhD
Researcher