[Gluster-users] Directory ctime/mtime not synced on node being healed
Recently, we lost a brick in a 4-node distribute + replica 2 volume. The host itself was fine, so we simply fixed the hardware failure, recreated the zpool and zfs, set the correct trusted.glusterfs.volume-id, restarted the gluster daemons on the host, and the heal got to work. The version running is 3.7.4 atop Ubuntu Trusty.

However, we’ve noticed that directories created on the brick being healed are not getting the correct ctime and mtime. Files, on the other hand, are being set correctly.

$ gluster volume info edc1

Volume Name: edc1
Type: Distributed-Replicate
Volume ID: 2f6b5804-e2d8-4400-93e9-b172952b1aae
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs4:/fs4/edc1
Brick2: fs5:/fs5/edc1
Brick3: hdfs5:/hdfs5/edc1
Brick4: hdfs6:/hdfs6/edc1
Options Reconfigured:
performance.write-behind-window-size: 1GB
performance.cache-size: 1GB
performance.readdir-ahead: enable
performance.read-ahead: enable

Example:

On the glusterfs mount:

  File: ‘BSA_9781483021973’
  Size: 36          Blocks: 2          IO Block: 131072 directory
Device: 19h/25d     Inode: 11345194644681878130  Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 04:01:49.520001319 -0800
Modify: 2014-08-29 09:20:50.006294000 -0700
Change: 2015-02-16 00:04:21.312079523 -0800
 Birth: -

On the unfailed brick:

  File: ‘BSA_9781483021973’
  Size: 10          Blocks: 6          IO Block: 1024   directory
Device: 1ah/26d     Inode: 25261       Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 04:01:49.520001319 -0800
Modify: 2014-08-29 09:20:50.006294000 -0700
Change: 2015-02-16 00:04:21.312079523 -0800
 Birth: -

On the failed brick that’s healing:

  File: ‘BSA_9781483021973’
  Size: 10          Blocks: 6          IO Block: 131072 directory
Device: 17h/23d     Inode: 252324      Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 10:10:35.441261192 -0800
Modify: 2015-11-25 04:07:36.354860631 -0800
Change: 2015-11-25 04:07:36.354860631 -0800
 Birth: -

Normally this wouldn’t be an issue, except that the glusterfs mount is now reporting the healed brick’s ctime and mtime for directories for which the failed node has become the authoritative replica. An example:

On a non-failed brick:

  File: ‘BSA_9780792765073’
  Size: 23          Blocks: 6          IO Block: 3072   directory
Device: 1ah/26d     Inode: 3734793     Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 10:22:25.374931735 -0800
Modify: 2015-03-24 13:56:53.371733811 -0700
Change: 2015-03-24 13:56:53.371733811 -0700
 Birth: -

On the glusterfs mount:

  File: ‘BSA_9780792765073’
  Size: 97          Blocks: 2          IO Block: 131072 directory
Device: 19h/25d     Inode: 13293019492851992284  Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 10:22:20.922782180 -0800
Modify: 2015-11-25 04:03:21.889978948 -0800
Change: 2015-11-25 04:03:21.889978948 -0800
 Birth: -

Thanks,
-t
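For reference, the recovery steps described above map roughly onto the commands below. This is a minimal sketch, not the poster's exact procedure: the post does not say which brick failed, so /fs5/edc1 stands in for a surviving replica and /fs4/edc1 for the recreated brick purely as examples; adjust both before running anything.

# Read the volume ID off a surviving brick (hex form of the volume UUID above):
getfattr -n trusted.glusterfs.volume-id -e hex /fs5/edc1

# Stamp the freshly recreated brick directory with the same ID
# (0x2f6b5804e2d8440093e9b172952b1aae is the volume UUID with the dashes removed):
setfattr -n trusted.glusterfs.volume-id -v 0x2f6b5804e2d8440093e9b172952b1aae /fs4/edc1

# Restart the gluster daemons on the repaired host (Ubuntu Trusty packaging):
service glusterfs-server restart

# Trigger a full heal and watch its progress:
gluster volume heal edc1 full
gluster volume heal edc1 info

Comparing stat output per brick, as done in the post, is the quickest way to spot which directories came out of the heal with the wrong mtime/ctime.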
[Gluster-users] 3.6.3 Ubuntu PPA
Just wondering if we can expect 3.6.3 to make it to Launchpad anytime soon?

Thanks,
-t
[Gluster-users] NFS I/O errors after replicated -> distributed+replicate add-brick
Hi, all:

We had a two-node gluster cluster (replicated, 2 replicas) to which we recently added two more nodes/bricks and then performed a rebalance, making it a distributed-replicate volume. Since doing so, any NFS access now returns a “Remote I/O error” for every operation (stat, read, write, whatever), although the operation appears to in fact succeed. I don’t see anything in the gluster logs that would help. The bricks are backstored on ZFS vols. Any hints? It’s Gluster 3.6.2 on Ubuntu Trusty. Clients using glusterfs throw some concerning errors as well - see the log excerpt at the bottom for examples.

Thanks,
-t

Status of volume: edc1
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick fs4:/fs4/edc1                             49154   Y       2435
Brick fs5:/fs5/edc1                             49154   Y       2328
Brick hdfs5:/hdfs5/edc1                         49152   Y       26725
Brick hdfs6:/hdfs6/edc1                         49152   Y       4994
NFS Server on localhost                         2049    Y       31503
Self-heal Daemon on localhost                   N/A     Y       31510
NFS Server on 10.54.90.13                       2049    Y       16310
Self-heal Daemon on 10.54.90.13                 N/A     Y       16317
NFS Server on hdfs6                             2049    Y       5006
Self-heal Daemon on hdfs6                       N/A     Y       5013
NFS Server on hdfs5                             2049    Y       26737
Self-heal Daemon on hdfs5                       N/A     Y       26744

Task Status of Volume edc1
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : b3095ab2-c428-4681-b545-36941a8816f6
Status               : completed

Volume Name: edc1
Type: Distributed-Replicate
Volume ID: 2f6b5804-e2d8-4400-93e9-b172952b1aae
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs4:/fs4/edc1
Brick2: fs5:/fs5/edc1
Brick3: hdfs5:/hdfs5/edc1
Brick4: hdfs6:/hdfs6/edc1
Options Reconfigured:
performance.cache-size: 1GB
performance.write-behind-window-size: 1GB

volume edc1-client-0
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /fs4/edc1
    option remote-host fs4
    option ping-timeout 42
end-volume

volume edc1-client-1
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /fs5/edc1
    option remote-host fs5
    option ping-timeout 42
end-volume

volume edc1-client-2
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /hdfs5/edc1
    option remote-host hdfs5
    option ping-timeout 42
end-volume

volume edc1-client-3
    type protocol/client
    option send-gids true
    option transport-type tcp
    option remote-subvolume /hdfs6/edc1
    option remote-host hdfs6
    option ping-timeout 42
end-volume

volume edc1-replicate-0
    type cluster/replicate
    subvolumes edc1-client-0 edc1-client-1
end-volume

volume edc1-replicate-1
    type cluster/replicate
    subvolumes edc1-client-2 edc1-client-3
end-volume

volume edc1-dht
    type cluster/distribute
    subvolumes edc1-replicate-0 edc1-replicate-1
end-volume

volume edc1-write-behind
    type performance/write-behind
    option cache-size 1GB
    subvolumes edc1-dht
end-volume

volume edc1-read-ahead
    type performance/read-ahead
    subvolumes edc1-write-behind
end-volume

volume edc1-io-cache
    type performance/io-cache
    option cache-size 1GB
    subvolumes edc1-read-ahead
end-volume

volume edc1-quick-read
    type performance/quick-read
    option cache-size 1GB
    subvolumes edc1-io-cache
end-volume

volume edc1-open-behind
    type performance/open-behind
    subvolumes edc1-quick-read
end-volume

volume edc1-md-cache
    type performance/md-cache
    subvolumes edc1-open-behind
end-volume

volume edc1
    type debug/io-stats
    option count-fop-hits off
    option latency-measurement off
    subvolumes edc1-md-cache
end-volume

[2015-02-26 22:13:07.473839] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.474890] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.475891] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.531037] I [dht-common.c:1822:dht_lookup_cbk] 0-edc1-dht: Entry /cc/aspera/sandbox/mwt-wea/.local missing on subvol edc1-replicate-0
[2015-02-26 22:13:07.532210] I
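Not a diagnosis, but after an add-brick plus rebalance the “Entry ... missing on subvol” messages above are usually worth checking against the DHT layout and any pending heals. A hedged sketch of generic post-expansion checks follows; brick paths are taken from the volume info above, the directory path from the log excerpt, and none of this is confirmed to be the cause of the NFS “Remote I/O error”:

# Verify that the bricks carry a DHT layout xattr for the directory named in the log
# (repeat for each brick; old and new bricks should all have one):
getfattr -n trusted.glusterfs.dht -e hex /fs4/edc1/cc/aspera/sandbox/mwt-wea
getfattr -n trusted.glusterfs.dht -e hex /hdfs5/edc1/cc/aspera/sandbox/mwt-wea

# Re-run a layout-only rebalance if directories are missing layouts on the new bricks:
gluster volume rebalance edc1 fix-layout start
gluster volume rebalance edc1 status

# Check for entries still pending self-heal after the expansion:
gluster volume heal edc1 info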