On Mon, Aug 29, 2016 at 7:14 AM, David Gossage <dgoss...@carouselchecks.com> wrote:
> On Mon, Aug 29, 2016 at 5:25 AM, Krutika Dhananjay <kdhan...@redhat.com>
> wrote:
>
>> Could you attach both client and brick logs? Meanwhile I will try these
>> steps out on my machines and see if it is easily recreatable.
>>
> Hoping 7z files are accepted by the mail server.
> Also, I didn't translate timezones, but in CST I started the node 1 heal
> at 2016-08-26 20:26:42, and then the next morning I started the initial
> node 2 heal at 2016-08-27 07:58:34.
>
> -Krutika
>>
>> On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <
>> dgoss...@carouselchecks.com> wrote:
>>
>>> CentOS 7, Gluster 3.8.3
>>>
>>> Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>> Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>> Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
>>> Options Reconfigured:
>>> cluster.data-self-heal-algorithm: full
>>> cluster.self-heal-daemon: on
>>> cluster.locking-scheme: granular
>>> features.shard-block-size: 64MB
>>> features.shard: on
>>> performance.readdir-ahead: on
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: on
>>> cluster.eager-lock: enable
>>> network.remote-dio: enable
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> server.allow-insecure: on
>>> cluster.self-heal-window-size: 1024
>>> cluster.background-self-heal-count: 16
>>> performance.strict-write-ordering: off
>>> nfs.disable: on
>>> nfs.addr-namelookup: off
>>> nfs.enable-ino32: off
>>> cluster.granular-entry-heal: on
>>>
>>> Friday I did a rolling upgrade from 3.8.3->3.8.3 with no issues.
>>> Following the steps detailed in previous recommendations, I began the
>>> process of replacing and healing bricks one node at a time:
>>>
>>> 1) kill the pid of the brick
>>> 2) reconfigure the brick from raid6 to raid10
>>> 3) recreate the brick directory
>>> 4) gluster volume start <> force
>>> 5) gluster volume heal <> full
>>>
>>> The 1st node worked as expected and took 12 hours to heal 1TB of data.
>>> Load was a little heavy but nothing shocking.
>>>
>>> About an hour after node 1 finished, I began the same process on node 2.
>>> The heal process kicked in as before, and the files in directories
>>> visible from the mount and in .glusterfs healed in a short time. Then it
>>> began the crawl of .shard, adding those files to the heal count, at
>>> which point the entire process basically ground to a halt. After 48
>>> hours, out of 19k shards it has added only 5900 to the heal list. Load
>>> on all 3 machines is negligible. It was suggested to change
>>> cluster.data-self-heal-algorithm to full and restart the volume, which I
>>> did. No effect. Tried relaunching the heal with no effect, regardless of
>>> which node I picked. I started each VM and performed a stat of all files
>>> from within it, or a full virus scan, and that seemed to cause short
>>> small spikes in shards added, but not by much. Logs are showing no real
>>> messages indicating anything is going on. I get occasional hits in the
>>> brick log for null lookups, making me think it's not really crawling the
>>> .shard directory but waiting for a shard lookup to add it. I'll get the
>>> following in the brick log, but not constantly, and sometimes multiple
>>> entries for the same shard.
>>>
>>> [2016-08-29 08:31:57.478125] W [MSGID: 115009]
>>> [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no resolution
>>> type for (null) (LOOKUP)
>>> [2016-08-29 08:31:57.478170] E [MSGID: 115050]
>>> [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server: 12591783:
>>> LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221)
>>> ==> (Invalid argument) [Invalid argument]
>>>
>>> This one repeated about 30 times in a row, then nothing for 10 minutes,
>>> then one hit for a different shard by itself.
>>>
>>> How can I determine if the heal is actually running? How can I kill it
>>> or force a restart? Does the node I start it from determine which
>>> directory gets crawled to determine heals?
>>>
>>> *David Gossage*
>>> *Carousel Checks Inc. | System Administrator*
>>> *Office* 708.613.2284
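For checking whether the self-heal daemon is actually making progress, a
minimal sketch of the CLI checks, assuming the volume name is GLUSTER1 (as
the brick log above suggests):

    # List the entries each brick still has queued for heal
    gluster volume heal GLUSTER1 info

    # Per-brick pending-heal counts only; faster when the queue is large
    gluster volume heal GLUSTER1 statistics heal-count

    # Crawl statistics from the self-heal daemon: crawl start/end times
    # and how many entries were healed in each crawl
    gluster volume heal GLUSTER1 statistics

    # Confirm the self-heal daemon process is online on all three nodes
    gluster volume status GLUSTER1

The self-heal daemon also logs its activity to
/var/log/glusterfs/glustershd.log on each node, which is another place to
confirm whether a crawl is in progress.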
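And a rough sketch of restarting the self-heal daemon and re-triggering the
crawl, again assuming the volume is GLUSTER1; toggling
cluster.self-heal-daemon stops and restarts the glustershd process on every
node:

    # Restart the self-heal daemon on all nodes
    gluster volume set GLUSTER1 cluster.self-heal-daemon off
    gluster volume set GLUSTER1 cluster.self-heal-daemon on

    # Kick off a fresh full heal once the daemons are back online
    gluster volume heal GLUSTER1 full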
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users