Hi,

This value is an ongoing rough estimate based on the amount of data the rebalance has migrated since it started. The value will change as the rebalance progresses.
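For context, the estimate is essentially throughput-based: the data still to be moved is divided by the average migration rate observed since the rebalance started. A minimal sketch of the idea (illustrative only, not the actual DHT estimator code; the 500 GB total below is a made-up figure):

    # Rate-based ETA sketch -- illustrative only, not gluster's actual estimator.
    def rebalance_eta_seconds(migrated_bytes, elapsed_seconds, total_bytes):
        """Seconds remaining at the average throughput observed so far."""
        if migrated_bytes == 0 or elapsed_seconds == 0:
            return float("inf")  # nothing migrated yet: no meaningful estimate
        rate = migrated_bytes / elapsed_seconds  # bytes/sec since the run began
        return (total_bytes - migrated_bytes) / rate

    # From your status output: ~82.8MB moved across both nodes in 0:36:49 (2209 s).
    # With a hypothetical 500 GB of data, that observed rate alone yields a huge
    # ETA, which is why the number looks absurd this early in the run.
    print(rebalance_eta_seconds(82.8e6, 2209, 500e9) / 3600, "hours")

Early in a run the observed rate is dominated by crawling rather than copying, so the estimate can be wildly off; it should settle as more data is migrated.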
A few questions (a quick way to gather these numbers is sketched after the quoted message below):

1. How many files/directories do you have on this volume?
2. What is the average size of the files?
3. What is the total size of the data on the volume?

Can you send us the rebalance log?

Thanks,
Nithya

On 30 April 2018 at 10:33, kiwizhang618 <kiwizhang...@gmail.com> wrote:

> I've run into a big problem: the cluster rebalance takes a very long time
> after adding a new node.
>
> gluster volume rebalance web status
>      Node  Rebalanced-files    size  scanned  failures  skipped       status  run time in h:m:s
> ---------  ----------------  ------  -------  --------  -------  -----------  -----------------
> localhost               900  43.5MB     2232         0       69  in progress            0:36:49
>  gluster2              1052  39.3MB     4393         0     1052  in progress            0:36:49
> Estimated time left for rebalance to complete : 9919:44:34
> volume rebalance: web: success
>
> the rebalance log
>
> [glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs
> version 3.12.8 (args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/web
> --xlator-option *dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes
> --xlator-option *dht.assert-no-child-down=yes --xlator-option *replicate*.data-self-heal=off
> --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off
> --xlator-option *dht.readdir-optimize=on --xlator-option *dht.rebalance-cmd=1
> --xlator-option *dht.node-uuid=d47ad89d-7979-4ede-9aba-e04f020bb4f0
> --xlator-option *dht.commit-hash=3610561770 --socket-file
> /var/run/gluster/gluster-rebalance-bdef10eb-1c83-410c-8ad3-fe286450004b.sock
> --pid-file /var/lib/glusterd/vols/web/rebalance/d47ad89d-7979-4ede-9aba-e04f020bb4f0.pid
> -l /var/log/glusterfs/web-rebalance.log)
> [2018-04-30 04:20:45.100902] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs:
> option 'address-family' is deprecated, preferred is 'transport.address-family',
> continuing with correction
> [2018-04-30 04:20:45.103927] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker]
> 0-epoll: Started thread with index 1
> [2018-04-30 04:20:55.191261] E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk]
> 0-web-dht: getxattr err for dir [No data available]
> [2018-04-30 04:21:19.783469] E [MSGID: 109023] [dht-rebalance.c:2669:gf_defrag_migrate_single_file]
> 0-web-dht: Migrate file failed: /2018/02/x187f6596-36ac-45e6-bd7a-019804dfe427.jpg,
> lookup failed [Stale file handle]
> The message "E [MSGID: 109039] [dht-common.c:3113:dht_find_local_subvol_cbk] 0-web-dht:
> getxattr err for dir [No data available]" repeated 2 times between
> [2018-04-30 04:20:55.191261] and [2018-04-30 04:20:55.193615]
>
> the gluster info
>
> Volume Name: web
> Type: Distribute
> Volume ID: bdef10eb-1c83-410c-8ad3-fe286450004b
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/home/export/md3/brick
> Brick2: gluster1:/export/md2/brick
> Brick3: gluster2:/home/export/md3/brick
> Options Reconfigured:
> nfs.trusted-sync: on
> nfs.trusted-write: on
> cluster.rebal-throttle: aggressive
> features.inode-quota: off
> features.quota: off
> cluster.shd-wait-qlength: 1024
> transport.address-family: inet
> cluster.lookup-unhashed: auto
> performance.cache-size: 1GB
> performance.client-io-threads: on
> performance.write-behind-window-size: 4MB
> performance.io-thread-count: 8
> performance.force-readdirp: on
> performance.readdir-ahead: on
> cluster.readdir-optimize: on
> performance.high-prio-threads: 8
> performance.flush-behind: on
> performance.write-behind: on
> performance.quick-read: off
> performance.io-cache: on
> performance.read-ahead: off
> server.event-threads: 8
> cluster.lookup-optimize: on
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: off
> performance.md-cache-timeout: 60
> network.inode-lru-limit: 90000
> diagnostics.brick-log-level: ERROR
> diagnostics.brick-sys-log-level: ERROR
> diagnostics.client-log-level: ERROR
> diagnostics.client-sys-log-level: ERROR
> cluster.min-free-disk: 20%
> cluster.self-heal-window-size: 16
> cluster.self-heal-readdir-size: 1024
> cluster.background-self-heal-count: 4
> cluster.heal-wait-queue-length: 128
> client.event-threads: 8
> performance.cache-invalidation: on
> nfs.disable: off
> nfs.acl: off
> cluster.brick-multiplex: disable
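P.S. To gather the numbers for questions 1-3, something along these lines run against each brick root should do. This is a rough sketch, not an official tool; the brick path comes from your volume info, and skipping .glusterfs keeps gluster's internal metadata out of the counts:

    # Rough helper: count files/dirs and total size under one brick root.
    import os

    brick = "/home/export/md3/brick"  # adjust per brick/server
    files = dirs = total = 0
    for root, subdirs, names in os.walk(brick):
        subdirs[:] = [d for d in subdirs if d != ".glusterfs"]  # skip internal metadata
        dirs += len(subdirs)
        for name in names:
            files += 1
            total += os.lstat(os.path.join(root, name)).st_size
    avg = total / files if files else 0
    print(f"files={files} dirs={dirs} total={total / 1e9:.2f} GB avg={avg / 1e6:.2f} MB")

Plain "find <brick> -type f | wc -l" and "du -sh <brick>" give the same answers if you prefer.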
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users