[Gluster-users] Big problem

2013-01-21 Thread Mario Kadastik
Hi,

I had a 4 x 3 gluster volume distributed over 6 servers (2 bricks from each). I 
wanted to move to a 4 x 2 volume, removing two nodes. The initial config is here:

http://fpaste.org/txxs/

I asked on the gluster IRC channel for the command to do this, got one, and 
proceeded to run it:

gluster volume remove-brick home0 replica 2 192.168.1.243:/d35 
192.168.1.240:/d35 192.168.1.243:/d36 192.168.1.240:/d36

Having read the gluster help output, I ascertained that I should probably append 
"start" to the command so that it would check everything gracefully (without it, 
the command did warn about possible data loss). However, the result was that it 
started rebalancing and immediately reconfigured the volume into 6 x 2 replica 
sets, so now I have a HUGE mess:

http://fpaste.org/EpKG/
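
For reference, this is the sequence I understood the remove-brick workflow is 
supposed to follow on 3.3 (just a sketch of what I pieced together from the help 
output and IRC; whether it is still the right sequence when the replica count is 
reduced at the same time is exactly what I'm not sure about): start the 
migration, poll its status, and only commit once it has finished.

  gluster volume remove-brick home0 replica 2 \
      192.168.1.243:/d35 192.168.1.240:/d35 192.168.1.243:/d36 192.168.1.240:/d36 start
  gluster volume remove-brick home0 replica 2 \
      192.168.1.243:/d35 192.168.1.240:/d35 192.168.1.243:/d36 192.168.1.240:/d36 status
  gluster volume remove-brick home0 replica 2 \
      192.168.1.243:/d35 192.168.1.240:/d35 192.168.1.243:/d36 192.168.1.240:/d36 commit

In my case, however, the volume was reshaped to 6 x 2 the moment the start step ran.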

Most operations now fail and directory listings show duplicate entries:

[root@wn-c-27 test]# ls
ls: cannot access hadoop-fuse-addon.tgz: No such file or directory
ls: cannot access hadoop-fuse-addon.tgz: No such file or directory
etc  hadoop-fuse-addon.tgz  hadoop-fuse-addon.tgz
[root@wn-c-27 test]# 

I urgently need help recovering from this state. It seems gluster has left me in 
a huge mess and it will be tough to get out of it. As soon as I noticed this I 
stopped the remove-brick with the stop command, but the mess remains. Should I 
force the remove-brick? Should I stop the volume and glusterd and manually 
reconfigure it back to 4 x 3, or how else can I recover to a consistent 
filesystem? This is users' /home, so a huge mess is NOT a good thing. Due to the 
3x replication there is no separate backup right now either...
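
For now, the commands I'm using just to inspect the current state (assuming the 
standard 3.3 CLI syntax, and repeating the brick list for the status call since 
the CLI seems to want it) are:

  gluster volume info home0
  gluster volume remove-brick home0 replica 2 \
      192.168.1.243:/d35 192.168.1.240:/d35 192.168.1.243:/d36 192.168.1.240:/d36 status
  gluster volume heal home0 info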

Mario Kadastik, PhD
Researcher

---
  Physics is like sex, sure it may have practical reasons, but that's not why 
we do it 
 -- Richard P. Feynman



Re: [Gluster-users] glusterfs performance issues

2013-01-07 Thread Mario Kadastik
 You have a replicated filesystem, brick1 and brick2.
 Brick 2 goes down and you edit a 4k file, appending data to it.
 That change, and the fact that there is a pending change, is stored on brick1.
 Brick2 returns to service.
 Your app wants to append to the file again. It calls stat on the file. Brick2 
 answers first stating that the file is 4k long. Your app seeks to 4k and 
 writes. Now the data you wrote before is gone.
 
 This is one of the processes by which stale stat data can cause data loss. 
 That's why each lookup() (which precedes the stat) causes a self-heal check 
 and why it's a problem that hasn't been resolved in the last two years.
 
 I don't know the answer. I know that they want this problem to be solved, but 
 right now the best solution is hardware. The lower the latency, the less of a 
 problem you'll have.

Well, I'd assume that a brick coming back online has to check everything against 
the other online bricks before it becomes authoritative in answering any client 
calls. If a brick comes up and sees other bricks of its replica set already 
online, the assumption should be that the data on this brick may be bad, and 
until a complete self-heal has been performed the brick should be treated as 
unreliable for that data. The next question is how to guarantee that a brick 
actually returns to a healthy state on a busy filesystem. The basic approach 
would be that any new writes go to all bricks (including the bad one) and are 
declared good on the healing brick, while as a background process all files on 
the brick are hashed and checked against the bricks that stayed live. In a 
reasonable environment this should complete in a reasonable amount of time and 
at worst means running at reduced performance while the sync happens, but it 
would guarantee no data loss unless you lose all the previously online bricks, 
in which case you're in disaster recovery anyway and this semi-live brick can 
still help recover files as of the time it last went down, or better.
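
In practice, at least with 3.3 as it stands, the closest thing I know of is 
driving the self-heal daemon by hand and watching its progress (a sketch, 
assuming the standard 3.3 heal sub-commands):

  gluster volume heal home0                    # heal entries the changelogs mark as pending
  gluster volume heal home0 full               # crawl the whole volume and heal everything
  gluster volume heal home0 info               # entries still waiting to be healed
  gluster volume heal home0 info split-brain   # entries the daemon cannot resolve on its own

Whether that closes the stale-stat window you describe is exactly the open question.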

Mario Kadastik, PhD
Researcher

---
  Physics is like sex, sure it may have practical reasons, but that's not why 
we do it 
 -- Richard P. Feynman



Re: [Gluster-users] how well will this work

2012-12-27 Thread Mario Kadastik
 I'm going to make this as simple as possible. Every message to this list 
 should follow these rules:
 
 1. be helpful
 2. be constructive
 3. be respectful
 
 I will not tolerate ranting that serves no purpose. If your message doesn't 
 follow any of the rules above, then you shouldn't be posting it.

I might be jumping in at a random spot here, but looking at Stephan's e-mail, it 
was all three. It was helpful and constructive in that it outlined a concrete 
strategy that, in his opinion, would make glusterfs better, and to an extent I 
share that view: performance IS an issue, and it makes me hesitate to take 
glusterfs to the next level at our site (right now we have a 6-node, 12-brick 
configuration that's used extensively as /home; the target would be a 180-node, 
2 PB, distributed, 2-way replicated installation). We hit FUSE snags from day 
two and are running over NFS right now because negative lookup caching is not 
available through FUSE; in fact there is no caching at all. NFS, in turn, has 
hiccups that cause issues especially for us because we use OpenVZ containers 
with bind mounts: if the headnode's NFS mount goes stale, we have to do a lot of 
hacking to get it remounted in all the VZ containers. On at least two or three 
occasions I've had to stop all the containers, killing user tasks, in order to 
remount cleanly.
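
For the record, the recovery we end up doing when the headnode's NFS mount goes 
stale looks roughly like this (a sketch; the mount options are the usual ones 
for gluster's NFSv3 server and the container ID is just an example, but the 
painful part is repeating the bind mount for every running container):

  # on the headnode: drop the stale handle and remount the gluster NFS export
  umount -l /home                                 # lazy unmount, since the stale handle blocks a normal umount
  mount -t nfs -o vers=3,nolock,tcp se1:/home0 /home
  # re-create the bind mount inside each container root (101 is an example CTID)
  mount --bind /home /vz/root/101/home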

And to be fair, at least in this particular e-mail I didn't really see much 
disrespect, just some comparisons that I think still stayed within a respectful 
range.

Mario Kadastik, PhD
Researcher

---
 Physics is like sex, sure it may have practical reasons, but that's not why 
we do it 
-- Richard P. Feynman



[Gluster-users] stuck lock

2012-12-13 Thread Mario Kadastik
-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC 
from uuid: f1a89ed2-a2f5-49a9-9482-1c6984c37945
[2012-12-13 15:09:33.566024] I 
[glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC 
from uuid: b1ce84be-de0b-4ae1-a1e8-758d828b8872
[2012-12-13 15:09:33.566047] I 
[glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC 
from uuid: 0f61d484-0f93-4144-b166-2145f4ea4427
[2012-12-13 15:09:33.566069] I 
[glusterd-rpc-ops.c:548:glusterd3_1_cluster_lock_cbk] 0-glusterd: Received ACC 
from uuid: d9b48655-4b25-4ad2-be19-c5ec8768a789
[2012-12-13 15:09:33.566224] I 
[glusterd-op-sm.c:2039:glusterd_op_ac_send_stage_op] 0-glusterd: Sent op req to 
5 peers
[2012-12-13 15:09:33.566420] I 
[glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from 
uuid: b1ce84be-de0b-4ae1-a1e8-758d828b8872
[2012-12-13 15:09:33.566450] I 
[glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from 
uuid: d9b48655-4b25-4ad2-be19-c5ec8768a789
[2012-12-13 15:09:33.566499] I 
[glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from 
uuid: f1a89ed2-a2f5-49a9-9482-1c6984c37945
[2012-12-13 15:09:33.566524] I 
[glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from 
uuid: 0f61d484-0f93-4144-b166-2145f4ea4427
[2012-12-13 15:09:33.57] I 
[glusterd-rpc-ops.c:881:glusterd3_1_stage_op_cbk] 0-glusterd: Received ACC from 
uuid: 663ecbfb-4209-417e-a955-6c9f72751dbc

the command hangs here, so I hit Ctrl+C

[root@se1 home0]# gluster volume heal home0
operation failed
[root@se1 home0]# 
== cli.log ==
[2012-12-13 15:10:00.686308] W [rpc-transport.c:174:rpc_transport_load] 
0-rpc-transport: missing 'option transport-type'. defaulting to socket
[2012-12-13 15:10:00.842108] I [cli-rpc-ops.c:5928:gf_cli3_1_heal_volume_cbk] 
0-cli: Received resp to heal volume
[2012-12-13 15:10:00.842187] I [input.c:46:cli_batch] 0-: Exiting with: -1

== etc-glusterfs-glusterd.vol.log ==
[2012-12-13 15:10:00.841789] I 
[glusterd-volume-ops.c:492:glusterd_handle_cli_heal_volume] 0-management: 
Received heal vol req for volume home0
[2012-12-13 15:10:00.841910] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: 
Unable to get lock for uuid: c3ce6b9c-6297-4e77-924c-b44e2c13e58f, lock held 
by: c3ce6b9c-6297-4e77-924c-b44e2c13e58f
[2012-12-13 15:10:00.841926] E [glusterd-handler.c:458:glusterd_op_txn_begin] 
0-management: Unable to acquire local lock, ret: -1
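
What I'm checking at this point (a sketch of the commands, not a confirmed 
procedure) is which node actually holds the lock, by matching the lock-holder 
uuid from the log against this node's own uuid and the peers; restarting 
glusterd on that node should then drop the stale cluster lock without touching 
the brick (glusterfsd) processes:

  cat /var/lib/glusterd/glusterd.info   # this node's own uuid -- here it is the same as the lock holder
  gluster peer status                   # uuids of the other peers
  service glusterd restart              # restart only the management daemon on the lock-holding node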

Mario Kadastik, PhD
Researcher

---
  Physics is like sex, sure it may have practical reasons, but that's not why 
we do it 
 -- Richard P. Feynman



Re: [Gluster-users] stuck lock

2012-12-13 Thread Mario Kadastik
Hi all,

First, two updates. All gluster nodes are currently Scientific Linux 5.7 
(Linux se1 2.6.18-308.16.1.el5 #1 SMP Wed Oct 3 00:53:20 EDT 2012 x86_64 x86_64 
x86_64 GNU/Linux) running gluster 3.3.1. The client nodes that mount the volume 
over NFS are CentOS 6.3.

Second, after everything else failed I unmounted the volume everywhere 
(sometimes by force), stopped glusterd and glusterfsd, and after restarting them 
the issue had disappeared. However, this is NOT how I'd like to fix things, as 
it was very disruptive.

Mario Kadastik, PhD
Researcher

---
  Physics is like sex, sure it may have practical reasons, but that's not why 
we do it 
 -- Richard P. Feynman
