Hopefully I'm not derailing this thread too far, but I have a related rebalance progress/speed issue.
I have a rebalance process that has been running for 3-4 days. Is there a good way to see whether it is running successfully, or might this be a sign of some problem? This is on a 4-node distribute setup with v3.4.2 and 45T of data.

The *-rebalance.log has been silent since some informational messages when the rebalance started. There were a few initial warnings and errors that I observed, though:

E [client-handshake.c:1397:client_setvolume_cbk] 0-cluster2-client-0: SETVOLUME on remote-host failed: Authentication failed
W [client-handshake.c:1365:client_setvolume_cbk] 0-cluster2-client-4: failed to set the volume (Permission denied)
W [client-handshake.c:1391:client_setvolume_cbk] 0-cluster2-client-4: failed to get 'process-uuid' from reply dict
W [socket.c:514:__socket_rwv] 0-cluster2-client-3: readv failed (No data available)

"gluster volume status" reports that the rebalance is in progress, and the process listed in vols/<volname>/rebalance/<hash>.pid is still running on the server, but "gluster volume rebalance <volname> status" reports 0 for everything (files scanned or rebalanced, failures, run time).

Thanks,
Matt

On Thu, Feb 27, 2014 at 12:39 AM, Shylesh Kumar <shmo...@redhat.com> wrote:
> Hi Viktor,
>
> Lots of optimizations and improvements went into 3.4, so it should be
> faster than 3.2.
> Just to make sure what's happening, could you please check the rebalance
> logs, which will be in /var/log/glusterfs/<volname>-rebalance.log, and
> see whether there is any progress?
>
> Thanks,
> Shylesh
>
>
> Viktor Villafuerte wrote:
>
>> Can anybody confirm/dispute that this is normal/abnormal?
>>
>> v
>>
>>
>> On Tue 25 Feb 2014 15:21:40, Viktor Villafuerte wrote:
>>
>>> Hi all,
>>>
>>> I have a distributed-replicated set with 2 servers (replicas) and am
>>> trying to add another set of replicas: 1 x (1x1) => 2 x (1x1)
>>>
>>> I have about 23G of data which I copy onto the first replica, check
>>> everything, then add the other set of replicas and eventually run
>>> rebalance fix-layout and migrate-data.
>>>
>>> On Gluster v3.2.5 this took about 30 mins (rebalance + migrate-data).
>>>
>>> On Gluster v3.4.2 this has been running for almost 4 hours and is
>>> still not finished.
>>>
>>> As I may have to do this in production, where the amount of data is
>>> significantly larger than 23G, I'm looking at about three weeks of
>>> waiting for the rebalance :)
>>>
>>> My question is whether this is as it's meant to be. I can see that
>>> v3.4.2 gives me more info about the rebalance process etc., but that
>>> surely cannot justify the enormous time difference.
>>>
>>> Is this normal/expected behaviour? If so, I will have to stick with
>>> v3.2.5, as it seems much quicker.
>>>
>>> Please let me know if there is any well-known option/way/secret to
>>> speed up the rebalance on v3.4.2.
>>>
>>>
>>> thanks
>>>
>>>
>>> --
>>> Regards
>>>
>>> Viktor Villafuerte
>>> Optus Internet Engineering
>>> t: 02 808-25265
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
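For anyone hitting the same symptom, one sanity check that is independent of the gluster CLI is to verify that the pid recorded in vols/<volname>/rebalance/<hash>.pid still refers to a live process. A minimal sketch, assuming a POSIX shell; `check_rebalance_pid` is a hypothetical helper written for this thread, not part of GlusterFS, and the pidfile path is whatever your volume actually uses:

```shell
# Hypothetical helper: report whether the process named in a pidfile is alive.
# Usage: check_rebalance_pid /var/lib/glusterd/vols/<volname>/rebalance/<hash>.pid
check_rebalance_pid() {
    pidfile="$1"

    # Refuse to guess if the pidfile is missing or unreadable.
    [ -r "$pidfile" ] || { echo "no pidfile at $pidfile"; return 2; }

    pid=$(cat "$pidfile")

    # kill -0 delivers no signal; it only tests that the pid exists
    # and that we are permitted to signal it.
    if kill -0 "$pid" 2>/dev/null; then
        echo "rebalance process $pid is running"
    else
        echo "rebalance process $pid is NOT running"
        return 1
    fi
}
```

If the pid is alive but "gluster volume rebalance <volname> status" still reports zeros, that points at a status-reporting or handshake problem (consistent with the authentication errors above) rather than a dead rebalance process.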