On Tue, Aug 30, 2016 at 8:52 AM, David Gossage <dgoss...@carouselchecks.com> wrote:
> On Tue, Aug 30, 2016 at 8:01 AM, Krutika Dhananjay <kdhan...@redhat.com> wrote:
>
>> On Tue, Aug 30, 2016 at 6:20 PM, Krutika Dhananjay <kdhan...@redhat.com> wrote:
>>
>>> On Tue, Aug 30, 2016 at 6:07 PM, David Gossage <dgoss...@carouselchecks.com> wrote:
>>>
>>>> On Tue, Aug 30, 2016 at 7:18 AM, Krutika Dhananjay <kdhan...@redhat.com> wrote:
>>>>
>>>>> Could you also share the glustershd logs?
>>>>
>>>> I'll get them when I get to work, sure.
>>>>
>>>>> I tried the same steps that you mentioned multiple times, but heal is running to completion without any issues.
>>>>>
>>>>> It must be said that 'heal full' traverses the files and directories in a depth-first order and does heals also in the same order. But if it gets interrupted in the middle (say because self-heal-daemon was either intentionally or unintentionally brought offline and then brought back up), self-heal will only pick up the entries that are so far marked as new entries that need heal, which it will find in the indices/xattrop directory. What this means is that those files and directories that were not visited during the crawl will remain untouched and unhealed in this second iteration of heal, unless you execute a 'heal full' again.
>>>>
>>>> So should it start healing shards as it crawls, or not until after it crawls the entire .shard directory? At the pace it was going that could be a week, with one node appearing in the cluster but with no shard files if anything tries to access a file on that node. From my experience the other day, telling it to heal full again did nothing, regardless of the node used.
>>
>> Crawl is started from '/' of the volume.
>> Whenever self-heal detects during the crawl that a file or directory is present in some brick(s) and absent in others, it creates the file on the bricks where it is absent and marks the fact that the file or directory might need data/entry and metadata heal too (this also means that an index is created under .glusterfs/indices/xattrop of the src bricks). And the data/entry and metadata heal are picked up and done in the background with the help of these indices.
>
> Looking at my 3rd node as an example, I find nearly the exact same number of files in the xattrop dir as reported by the heal count at the time I brought down node 2 to try and alleviate the read IO errors, which seemed to occur from what I was guessing were attempts to use the node with no shards for reads.
>
> Also attached are the glustershd logs from the 3 nodes, along with the test node I tried yesterday with the same results.

Is it possible you just need to spam the heal full command? Wait a certain amount of time for it to time out? The test server that I tried yesterday, which stopped at listing 33 shards and then healed none of them, still had 33 shards in the list this morning. I issued another heal full and it jumped up and found the missing shards.

On the one hand it's reassuring that if I just spam the command enough, eventually it will heal. On the other hand it's disconcerting that I have to spam the command enough times before the heal will start. I can't test whether the same behavior would occur on a live node, as I expect that if it did kick in heals I'd have 12 hours of high load during the copy again. But I can test whether it happens after the last shift. Though I lost track of how many times I tried restarting heal full over Saturday and Sunday, when it looked to be doing nothing according to all of the documented heal-tracking commands.

>>>>> My suspicion is that this is what happened on your setup. Could you confirm if that was the case?
>>>>
>>>> Brick was brought online with force start, then a full heal launched.
>>>> Hours later, after it became evident that it was not adding new files to heal, I did try restarting the self-heal daemon and relaunching full heal again. But this was after the heal had basically already failed to work as intended.
>>>
>>> OK. How did you figure it was not adding any new files? I need to know what places you were monitoring to come to this conclusion.
>>>
>>> -Krutika
>>>
>>>>> As for those logs, I did manage to do something that caused these warning messages you shared earlier to appear in my client and server logs. Although these logs are annoying and a bit scary too, they didn't do any harm to the data in my volume. Why they appear just after a brick is replaced and under no other circumstances is something I'm still investigating.
>>>>>
>>>>> But for the future, it would be good to follow the steps Anuradha gave, as that would allow self-heal to at least detect that it has some repairing to do whenever it is restarted, whether intentionally or otherwise.
>>>>
>>>> I followed those steps as described on my test box and ended up with the exact same outcome: adding shards at an agonizingly slow pace and no creation of the .shard directory or heals on the shard directory. Directories visible from the mount healed quickly. This was with one VM, so it has only 800 shards as well. After hours at work it had added a total of 33 shards to be healed. I sent those logs yesterday as well, though not the glustershd logs.
>>>>
>>>> Does the replace-brick command copy files in the same manner? For these purposes I am contemplating just skipping the heal route.
>>>>
>>>>> -Krutika
>>>>>
>>>>> On Tue, Aug 30, 2016 at 2:22 AM, David Gossage <dgoss...@carouselchecks.com> wrote:
>>>>>
>>>>>> Attached brick and client logs from the test machine where the same behavior occurred; not sure if anything new is there.
>>>>>> It's still on 3.8.2.
>>>>>>
>>>>>> Number of Bricks: 1 x 3 = 3
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>>> Options Reconfigured:
>>>>>> cluster.locking-scheme: granular
>>>>>> performance.strict-o-direct: off
>>>>>> features.shard-block-size: 64MB
>>>>>> features.shard: on
>>>>>> server.allow-insecure: on
>>>>>> storage.owner-uid: 36
>>>>>> storage.owner-gid: 36
>>>>>> cluster.server-quorum-type: server
>>>>>> cluster.quorum-type: auto
>>>>>> network.remote-dio: on
>>>>>> cluster.eager-lock: enable
>>>>>> performance.stat-prefetch: off
>>>>>> performance.io-cache: off
>>>>>> performance.quick-read: off
>>>>>> cluster.self-heal-window-size: 1024
>>>>>> cluster.background-self-heal-count: 16
>>>>>> nfs.enable-ino32: off
>>>>>> nfs.addr-namelookup: off
>>>>>> nfs.disable: on
>>>>>> performance.read-ahead: off
>>>>>> performance.readdir-ahead: on
>>>>>> cluster.granular-entry-heal: on
>>>>>>
>>>>>> On Mon, Aug 29, 2016 at 2:20 PM, David Gossage <dgoss...@carouselchecks.com> wrote:
>>>>>>
>>>>>>> On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <ata...@redhat.com> wrote:
>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>> > From: "David Gossage" <dgoss...@carouselchecks.com>
>>>>>>>> > To: "Anuradha Talur" <ata...@redhat.com>
>>>>>>>> > Cc: "gluster-users@gluster.org List" <Gluster-users@gluster.org>, "Krutika Dhananjay" <kdhan...@redhat.com>
>>>>>>>> > Sent: Monday, August 29, 2016 5:12:42 PM
>>>>>>>> > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>> >
>>>>>>>> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <ata...@redhat.com> wrote:
>>>>>>>> >
>>>>>>>> > > Response inline.
>>>>>>>> > >
>>>>>>>> > > ----- Original Message -----
>>>>>>>> > > > From: "Krutika Dhananjay" <kdhan...@redhat.com>
>>>>>>>> > > > To: "David Gossage" <dgoss...@carouselchecks.com>
>>>>>>>> > > > Cc: "gluster-users@gluster.org List" <Gluster-users@gluster.org>
>>>>>>>> > > > Sent: Monday, August 29, 2016 3:55:04 PM
>>>>>>>> > > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>> > > >
>>>>>>>> > > > Could you attach both client and brick logs? Meanwhile I will try these steps out on my machines and see if it is easily recreatable.
>>>>>>>> > > >
>>>>>>>> > > > -Krutika
>>>>>>>> > > >
>>>>>>>> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <dgoss...@carouselchecks.com> wrote:
>>>>>>>> > > >
>>>>>>>> > > > CentOS 7, Gluster 3.8.3
>>>>>>>> > > >
>>>>>>>> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>>>>>> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>>>>>> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
>>>>>>>> > > > Options Reconfigured:
>>>>>>>> > > > cluster.data-self-heal-algorithm: full
>>>>>>>> > > > cluster.self-heal-daemon: on
>>>>>>>> > > > cluster.locking-scheme: granular
>>>>>>>> > > > features.shard-block-size: 64MB
>>>>>>>> > > > features.shard: on
>>>>>>>> > > > performance.readdir-ahead: on
>>>>>>>> > > > storage.owner-uid: 36
>>>>>>>> > > > storage.owner-gid: 36
>>>>>>>> > > > performance.quick-read: off
>>>>>>>> > > > performance.read-ahead: off
>>>>>>>> > > > performance.io-cache: off
>>>>>>>> > > > performance.stat-prefetch: on
>>>>>>>> > > > cluster.eager-lock: enable
>>>>>>>> > > > network.remote-dio: enable
>>>>>>>> > > > cluster.quorum-type: auto
>>>>>>>> > > > cluster.server-quorum-type: server
>>>>>>>> > > > server.allow-insecure: on
>>>>>>>> > > > cluster.self-heal-window-size: 1024
>>>>>>>> > > > cluster.background-self-heal-count: 16
>>>>>>>> > > > performance.strict-write-ordering: off
>>>>>>>> > > > nfs.disable: on
>>>>>>>> > > > nfs.addr-namelookup: off
>>>>>>>> > > > nfs.enable-ino32: off
>>>>>>>> > > > cluster.granular-entry-heal: on
>>>>>>>> > > >
>>>>>>>> > > > Friday did a rolling upgrade from 3.8.2 -> 3.8.3 with no issues. Following the steps detailed in previous recommendations, I began the process of replacing and healing bricks one node at a time:
>>>>>>>> > > >
>>>>>>>> > > > 1) kill pid of brick
>>>>>>>> > > > 2) reconfigure brick from raid6 to raid10
>>>>>>>> > > > 3) recreate directory of brick
>>>>>>>> > > > 4) gluster volume start <> force
>>>>>>>> > > > 5) gluster volume heal <> full
>>>>>>>> > >
>>>>>>>> > > Hi,
>>>>>>>> > >
>>>>>>>> > > I'd suggest that full heal is not used. There are a few bugs in full heal. Better safe than sorry ;) Instead I'd suggest the following steps:
>>>>>>>> >
>>>>>>>> > Currently I brought the node down by systemctl stop glusterd, as I was getting sporadic IO issues and a few VMs paused, so I'm hoping that will help. I may wait to do this till around 4 PM when most work is done, in case it shoots load up.
>>>>>>>> >
>>>>>>>> > > 1) kill pid of brick
>>>>>>>> > > 2) do the reconfiguring of the brick that you need
>>>>>>>> > > 3) recreate brick dir
>>>>>>>> > > 4) while the brick is still down, from the mount point:
>>>>>>>> > >    a) create a dummy non-existent dir under / of the mount.
>>>>>>>> >
>>>>>>>> > So if node 2 is the down brick, do I pick a node, for example 3, and make a test dir under its brick directory that doesn't exist on 2, or should I be doing this over a gluster mount?
>>>>>>>>
>>>>>>>> You should be doing this over the gluster mount.
>>>>>>>>
>>>>>>>> > >    b) set a non-existent extended attribute on / of the mount.
>>>>>>>> >
>>>>>>>> > Could you give me an example of an attribute to set?
>>>>>>>> > I've read a tad on this, and looked up attributes, but haven't set any yet myself.
>>>>>>>>
>>>>>>>> Sure: setfattr -n "user.some-name" -v "some-value" <path-to-mount>
>>>>>>>>
>>>>>>>> > > Doing these steps will ensure that heal happens only from the updated bricks to the down brick.
>>>>>>>> > > 5) gluster v start <> force
>>>>>>>> > > 6) gluster v heal <>
>>>>>>>> >
>>>>>>>> > Will it matter if somewhere in gluster the full heal command was run the other day? Not sure if it eventually stops or times out.
>>>>>>>>
>>>>>>>> full heal will stop once the crawl is done. So if you want to trigger heal again, run gluster v heal <>. Actually, even brick up or volume start force should trigger the heal.
>>>>>>>
>>>>>>> Did this on the test bed today. It's one server with 3 bricks on the same machine, so take that for what it's worth; also, it still runs 3.8.2. Maybe I'll update and re-run the test.
>>>>>>>
>>>>>>> killed brick
>>>>>>> deleted brick dir
>>>>>>> recreated brick dir
>>>>>>> created fake dir on gluster mount
>>>>>>> set suggested fake attribute on it
>>>>>>> ran volume start <> force
>>>>>>>
>>>>>>> Looked at the files it said needed healing, and it was just 8 shards that were modified during the few minutes I ran through the steps.
>>>>>>>
>>>>>>> Gave it a few minutes and it stayed the same. Ran gluster volume <> heal.
>>>>>>>
>>>>>>> It healed all the directories and files you can see over the mount, including the fake dir.
>>>>>>>
>>>>>>> Same issue for shards though: it adds more shards to heal at a glacial pace. There's a slight jump in speed if I stat every file and dir in a running VM, but not all shards.
>>>>>>>
>>>>>>> It started with 8 shards to heal and is now only at 33 out of 800, and probably won't finish adding for a few days at the rate it goes.
>>>>>>> >>>>>>> >>>>>>> >>>>>>>> > > >>>>>>>> > > > 1st node worked as expected took 12 hours to heal 1TB data. >>>>>>>> Load was >>>>>>>> > > little >>>>>>>> > > > heavy but nothing shocking. >>>>>>>> > > > >>>>>>>> > > > About an hour after node 1 finished I began same process on >>>>>>>> node2. Heal >>>>>>>> > > > proces kicked in as before and the files in directories >>>>>>>> visible from >>>>>>>> > > mount >>>>>>>> > > > and .glusterfs healed in short time. Then it began crawl of >>>>>>>> .shard adding >>>>>>>> > > > those files to heal count at which point the entire proces >>>>>>>> ground to a >>>>>>>> > > halt >>>>>>>> > > > basically. After 48 hours out of 19k shards it has added 5900 >>>>>>>> to heal >>>>>>>> > > list. >>>>>>>> > > > Load on all 3 machnes is negligible. It was suggested to >>>>>>>> change this >>>>>>>> > > value >>>>>>>> > > > to full cluster.data-self-heal-algorithm and restart volume >>>>>>>> which I >>>>>>>> > > did. No >>>>>>>> > > > efffect. Tried relaunching heal no effect, despite any node >>>>>>>> picked. I >>>>>>>> > > > started each VM and performed a stat of all files from within >>>>>>>> it, or a >>>>>>>> > > full >>>>>>>> > > > virus scan and that seemed to cause short small spikes in >>>>>>>> shards added, >>>>>>>> > > but >>>>>>>> > > > not by much. Logs are showing no real messages indicating >>>>>>>> anything is >>>>>>>> > > going >>>>>>>> > > > on. I get hits to brick log on occasion of null lookups >>>>>>>> making me think >>>>>>>> > > its >>>>>>>> > > > not really crawling shards directory but waiting for a shard >>>>>>>> lookup to >>>>>>>> > > add >>>>>>>> > > > it. I'll get following in brick log but not constant and >>>>>>>> sometime >>>>>>>> > > multiple >>>>>>>> > > > for same shard. 
>>>>>>>> > > >
>>>>>>>> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009] [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no resolution type for (null) (LOOKUP)
>>>>>>>> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server: 12591783: LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221) ==> (Invalid argument) [Invalid argument]
>>>>>>>> > > >
>>>>>>>> > > > This one repeated about 30 times in a row, then nothing for 10 minutes, then one hit for one different shard by itself.
>>>>>>>> > > >
>>>>>>>> > > > How can I determine if heal is actually running? How can I kill it or force a restart? Does the node I start it from determine which directory gets crawled to determine heals?
>>>>>>>> > > >
>>>>>>>> > > > David Gossage
>>>>>>>> > > > Carousel Checks Inc. | System Administrator
>>>>>>>> > > > Office 708.613.2284
>>>>>>>> > > >
>>>>>>>> > > > _______________________________________________
>>>>>>>> > > > Gluster-users mailing list
>>>>>>>> > > > Gluster-users@gluster.org
>>>>>>>> > > > http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>> > >
>>>>>>>> > > --
>>>>>>>> > > Thanks,
>>>>>>>> > > Anuradha.
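As a side note on the scale involved: the shard counts discussed above follow directly from features.shard-block-size. Below is a minimal sketch of that arithmetic, assuming the usual sharding convention (not spelled out in this thread) that block 0 of a file stays at the file's own path while blocks 1 onward are stored under /.shard as "<gfid>.<block>"; the GFID and block number are taken from the LOOKUP errors quoted above.

```python
SHARD_BLOCK_SIZE = 64 * 1024 * 1024  # features.shard-block-size: 64MB

def shard_index(offset: int, block_size: int = SHARD_BLOCK_SIZE) -> int:
    """Shard block number that a given byte offset falls into."""
    return offset // block_size

def total_blocks(file_size: int, block_size: int = SHARD_BLOCK_SIZE) -> int:
    """Total number of blocks a file of this size occupies (block 0 included)."""
    return max(1, -(-file_size // block_size))  # ceiling division

# A 50GiB VM image at 64MB per block comes out to 800 blocks, consistent
# with the "only 800 shards" figure for the single-VM test volume.
print(total_blocks(50 * 1024**3))  # 800

# The shard named in the brick log: block 221, i.e. the 64MB slice of the
# image starting at byte offset 221 * 64MiB.
gfid = "241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff"
print(f"{gfid}.{shard_index(221 * SHARD_BLOCK_SIZE)}")
```

So a heal that has added 5900 of 19k shards after 48 hours has covered well under a third of roughly 1.2TB of shard data, which is why the pace reads as glacial compared to the 12-hour full heal on node 1.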
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users