> @Artem what is the average size of the files for your apaches ?
The average size is probably 15-20MB, but the files are as large as 100MB+. The files are a combination of Android APK files and image files.

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>

On Fri, May 15, 2020 at 11:20 PM Strahil Nikolov <[email protected]> wrote:

> On May 16, 2020 1:14:17 AM GMT+03:00, Artem Russakovskii <[email protected]> wrote:
>
> > Hi Hari,
> >
> > Thanks for the analysis.
> >
> > 1. I understand why the rebooted node would have 0 heal state files whereas the other nodes would be going up. The problem with 5.11 was that there was a bug that prevented the heal from completing, which as I mentioned was fixed in 5.13.
> >
> > 2. If the number of files to heal is known, why are operations like md5sum needed to perform the heal at all? Why doesn't the auto heal agent just go through the list one file at a time and perform the heal, then mark the files as healed? From what I've seen, even requesting a manual heal didn't do anything until those files were touched in some way (like md5sum).
> >
> > 3. cscope? I am unable to find what you're talking about - http://cscope.sourceforge.net/ seems to be a code search tool?
> >
> > 4. What's the best way to analyze and present the data about what's raising the load to 100+ on all nodes after reboots? If there's some monitoring tool I could run, reproduce the issue, then save the output and send it here, that'd be great.
> >
> > 5. Based on what I've seen when I straced apache processes, they would all hang for a long time when they ran across some of the gluster data. Specifically, one of the directories (with only several files, nothing special) which may be queried in some way by a lot of page loads (for context, we have 2 busy WordPress sites), would come up a lot in straces and hang. I even tried moving this directory out of gluster and adding a symlink, but I'm not sure that helped. I wonder if multiple conditions cause frequent read access to a certain resource in gluster to spike out of control and go haywire.
> > Here, the gluster root directory is SITE/public/wp-content/uploads, and the dirs I saw the most are symlinked as follows:
> > lrwxrwxrwx 1 archon810 users 73 Apr 26 15:47 wp-security-audit-log -> SITE/public/wp-content/wp-security-audit-log/
> > (belongs to https://wordpress.org/plugins/wp-security-audit-log/ but the Pro version)
> >
> > A sample strace log of requests to this dir:
> > [2020-05-15 15:10:15 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> > [2020-05-15 15:10:15 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log", R_OK) = 0
> > [2020-05-15 15:10:15 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log/custom-alerts.php", F_OK) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:15 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log/custom-sensors/", 0x7ffeff14c4d0) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:18 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
> > [2020-05-15 15:10:18 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log", R_OK) = 0
> > [2020-05-15 15:10:18 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log/custom-alerts.php", F_OK) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:18 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log/custom-sensors/", 0x7ffeff14c4d0) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:19 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> > [2020-05-15 15:10:19 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log", R_OK) = 0
> > [2020-05-15 15:10:19 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log/custom-alerts.php", F_OK) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:19 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log/custom-sensors/", 0x7ffeff14c4d0) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:21 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
> > [2020-05-15 15:10:21 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log", R_OK) = 0
> > [2020-05-15 15:10:21 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log/custom-alerts.php", F_OK) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:21 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log/custom-sensors/", 0x7ffeff14c4d0) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:23 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> > [2020-05-15 15:10:23 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log", R_OK) = 0
> > [2020-05-15 15:10:23 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log/custom-alerts.php", F_OK) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:23 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log/custom-sensors/", 0x7ffeff14c4d0) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:25 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> > [2020-05-15 15:10:25 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log", R_OK) = 0
> > [2020-05-15 15:10:25 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log/custom-alerts.php", F_OK) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:25 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log/custom-sensors/", 0x7ffeff14c4d0) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:27 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
> > [2020-05-15 15:10:27 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log", R_OK) = 0
> > [2020-05-15 15:10:27 PDT] access("SITE/public/wp-content/uploads/wp-security-audit-log/custom-alerts.php", F_OK) = -1 ENOENT (No such file or directory)
> > [2020-05-15 15:10:27 PDT] stat("SITE/public/wp-content/uploads/wp-security-audit-log/custom-sensors/", 0x7ffeff14c4d0) = -1 ENOENT (No such file or directory)
> >
> > The load spikes and everything hanging is seriously stressing me out because at any point it can cause an outage for us.
> >
> > Sincerely,
> > Artem
> >
> > --
> > Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
> > beerpla.net | @ArtemR <http://twitter.com/ArtemR>
> >
> > On Thu, May 7, 2020 at 11:32 PM Hari Gowtham <[email protected]> wrote:
> >
> > > The heal info is working fine. The explanation to what's happening:
> > > When a node goes down, the changes to this node can't be done. So the other nodes which were up get the changes and keep track, saying these files were changed (note: this change hasn't been reflected on the node which was down). Once the node that was down comes back up, it doesn't know what happened while it was down. But the other nodes know that there are a few changes which didn't make it to the rebooted node. So the node that was down is blamed by the other nodes. This is what is shown in the heal info. As the node that was down doesn't have any change that went into that node alone, it says 0 files to be healed, and the other nodes, as they have the data, say which are the files that need to heal.
> > > This is the expected working. So as per the rebooted node, heal info is working fine.
> > >
> > > About healing the file itself:
> > > Doing an operation on a file triggers client side heal as per the design; that's the reason these files are getting corrected after the md5sum (I hope this is done from the client side, not the backend itself). So this is expected.
> > > About the heals not happening for a long time, there can be some issue there. @Karthik Subrahmanya <[email protected]> is the better person to help you with this.
> > >
> > > About the CPU usage going higher:
> > > We need info about what is consuming more CPU. Glusterd needs to do a bit of handshake and connect after reboot. During this a little bit of data is transferred as well. If the number of nodes goes higher it can contribute to the hike. Similarly, if the heal is happening, then it can increase the usage. So we need info about what is consuming the cpu to know if it's expected or not. If this hike is expected, you can try using cscope to restrict the cpu usage by that particular process.
> > > On Tue, Apr 28, 2020 at 3:02 AM Artem Russakovskii <[email protected]> wrote:
> > >
> > > > Good news, after upgrading to 5.13 and running this scenario again, the self heal actually succeeded without my intervention following a server reboot.
> > > >
> > > > The load was still high during this process, but at least the endless heal issue is resolved.
> > > >
> > > > I'd still love to hear from the team on managing heal load spikes.
> > > >
> > > > Sincerely,
> > > > Artem
> > > >
> > > > --
> > > > Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
> > > > beerpla.net | @ArtemR <http://twitter.com/ArtemR>
> > > >
> > > > On Sun, Apr 26, 2020 at 3:13 PM Artem Russakovskii <[email protected]> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I've been observing this problem for a long time now and it's time to finally figure out what's going on.
> > > > >
> > > > > We're running gluster 5.11 and have a 10TB 1 x 4 = 4 replicate volume. I'll include its slightly redacted config below.
> > > > >
> > > > > When I reboot one of the servers and it goes offline for a bit, when it comes back, heal info tells me there are some files and dirs that are "heal pending". 0 "split-brain" and "possibly healing" - only "heal pending" are >0.
> > > > >
> > > > > 1. For some reason, the server that was rebooted shows "heal pending" 0. All other servers show "heal pending" with some number, say 65.
> > > > > 2. We have cluster.self-heal-daemon enabled.
> > > > > 3. The logs are full of "performing entry selfheal" and "completed entry selfheal" messages that continue to print endlessly.
> > > > > 4. This "heal pending" number never goes down by itself, but it does if I run some operation on it, like md5sum.
> > > > > 5. When the server goes down for reboot and especially when it comes back, the load on ALL servers shoots up through the roof (load of 100+) and ends up bringing everything down, including apache and nginx. My theory is that self-heal kicks in so hard that it kills IO on these attached Linode block devices. However, after some time - say 10 minutes - the load subsides, but the "heal pending" remains and the gluster logs continue to output "performing entry selfheal" and "completed entry selfheal" messages. This load spike has become a huge issue for us because it brings down the whole site for entire minutes.
> > > > > 6. At this point in my investigation, I noticed that the selfheal messages actually repeat for the same gfids over and over.
> > > > > [2020-04-26 21:32:29.877987] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc
> > > > > [2020-04-26 21:32:29.901246] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources= sinks=0 1 2
> > > > > [2020-04-26 21:32:32.171959] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc
> > > > > [2020-04-26 21:32:32.225828] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources= sinks=0 1 2
> > > > > [2020-04-26 21:32:33.346990] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc
> > > > > [2020-04-26 21:32:33.374413] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources= sinks=0 1 2
> > > > > 7. I used gfid-resolver.sh from https://gist.github.com/4392640.git to resolve this gfid to the real location and yup - it was one of the files (a dir actually) listed as "heal pending" in heal info. As soon as I ran md5sum on the file inside (which was also listed in "heal pending"), the log messages stopped repeating for this entry and it disappeared from "heal pending" heal info. These were the final log lines:
> > > > > [2020-04-26 21:32:35.642662] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-SNIP_data1-replicate-0: performing metadata selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc
> > > > > [2020-04-26 21:32:35.658714] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed metadata selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources=0 [1] 2 sinks=3
> > > > > [2020-04-26 21:32:35.686509] I [MSGID: 108026] [afr-self-heal-entry.c:897:afr_selfheal_entry_do] 0-SNIP_data1-replicate-0: performing entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc
> > > > > [2020-04-26 21:32:35.720387] I [MSGID: 108026] [afr-self-heal-common.c:1729:afr_log_selfheal] 0-SNIP_data1-replicate-0: Completed entry selfheal on 96d282cf-402f-455c-9add-5f03c088a1bc. sources=0 [1] 2 sinks=3
> > > > >
> > > > > I have to repeat this song and dance every time I reboot servers and run md5sum on each "heal pending" file or else the messages will continue presumably indefinitely. In the meantime, the files seem to be fine when accessed.
> > > > >
> > > > > What I don't understand is:
> > > > >
> > > > > 1. Why doesn't gluster just heal them properly instead of getting stuck? Or maybe this was fixed in v6 or v7, which I haven't upgraded to due to waiting for another unrelated issue to be fixed?
> > > > > 2. Why does heal info show 0 "heal pending" files on the server that was rebooted, but all other servers show the same number of "heal pending" entries >0?
> > > > > 3. Why are there these insane load spikes upon going down and especially coming back online? Is it related to the issue here? I'm pretty sure that it didn't happen in previous versions of gluster, when this issue didn't manifest - I could easily bring down one of the servers without it creating havoc when it comes back online.
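A side note on item 7 above, for anyone else who wants to resolve a gfid by hand: this is roughly the manual equivalent of what gfid-resolver.sh does, run directly on one brick. Only a sketch - the brick path is taken from the volume info quoted just below, the gfid is the one from my logs, and the two branches reflect that a directory's .glusterfs entry is a symlink to its parent while a regular file's entry is a hardlink:

    BRICK=/mnt/SNIP_block1/SNIP_data1
    GFID=96d282cf-402f-455c-9add-5f03c088a1bc
    GPATH="$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"

    if [ -h "$GPATH" ]; then
        # Directory gfid: the symlink target encodes the parent gfid and the directory name
        ls -l "$GPATH"
    else
        # File gfid: find the other hardlink(s) to the same inode on the brick
        find "$BRICK" -samefile "$GPATH" -not -path "*/.glusterfs/*"
    fi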
> > > > > Here's the volume info:
> > > > >
> > > > > Volume Name: SNIP_data1
> > > > > Type: Replicate
> > > > > Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
> > > > > Status: Started
> > > > > Snapshot Count: 0
> > > > > Number of Bricks: 1 x 4 = 4
> > > > > Transport-type: tcp
> > > > > Bricks:
> > > > > Brick1: SNIP:/mnt/SNIP_block1/SNIP_data1
> > > > > Brick2: SNIP:/mnt/SNIP_block1/SNIP_data1
> > > > > Brick3: SNIP:/mnt/SNIP_block1/SNIP_data1
> > > > > Brick4: SNIP:/mnt/SNIP_block1/SNIP_data1
> > > > > Options Reconfigured:
> > > > > performance.client-io-threads: on
> > > > > nfs.disable: on
> > > > > transport.address-family: inet
> > > > > cluster.self-heal-daemon: enable
> > > > > performance.cache-size: 1GB
> > > > > cluster.lookup-optimize: on
> > > > > performance.read-ahead: off
> > > > > client.event-threads: 4
> > > > > server.event-threads: 4
> > > > > performance.io-thread-count: 32
> > > > > cluster.readdir-optimize: on
> > > > > features.cache-invalidation: on
> > > > > features.cache-invalidation-timeout: 600
> > > > > performance.stat-prefetch: on
> > > > > performance.cache-invalidation: on
> > > > > performance.md-cache-timeout: 600
> > > > > network.inode-lru-limit: 500000
> > > > > performance.parallel-readdir: on
> > > > > performance.readdir-ahead: on
> > > > > performance.rda-cache-limit: 256MB
> > > > > network.remote-dio: enable
> > > > > network.ping-timeout: 5
> > > > > cluster.quorum-type: fixed
> > > > > cluster.quorum-count: 1
> > > > > cluster.granular-entry-heal: enable
> > > > > cluster.data-self-heal-algorithm: full
> > > > >
> > > > > Appreciate any insight. Thank you.
> > > > >
> > > > > Sincerely,
> > > > > Artem
> > > > >
> > > > > --
> > > > > Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
> > > > > beerpla.net | @ArtemR <http://twitter.com/ArtemR>
> > >
> > > --
> > > Regards,
> > > Hari Gowtham.
>
> Hi Hari,
>
> I can confirm that I have observed what Artem has described on v7.X. When one of my Gluster nodes was down for some time (more data to heal) and powered up, the node is barely responsive over ssh (a separate network from the gluster one) and the system is seriously loaded until the heal is over.
> I'm using the defaults for healing options.
> Does the healing process require some checksumming on blamed entries?
>
> @Artem, Gluster has 2 kinds of healing.
> A) FUSE clients can cause a healing of a file which is not the same on all bricks. This is why md5sum causes a heal. Usually even a simple 'stat' will trigger that, but I have noticed that Gluster with sharding might require reading the file with an offset that matches the shard in order for this type of heal to work.
>
> B) There is a heal daemon that runs every 15 min (or somewhere there) which crawls over the blamed entries and triggers healing.
>
> Also, as far as I know, each file that is being healed is also locked for the duration of the heal. That was the reason why oVirt uses sharding, so instead of the whole disk being locked, only a small piece (shard) is locked until healed.
>
> @Artem what is the average size of the files for your apaches ?
>
> Best Regards,
> Strahil Nikolov
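For the record, since both kinds of healing came up above, these are the commands I've been leaning on to watch and nudge heals after a reboot. Again just a sketch with my volume name; the last loop simply stats the reported paths through a client mount (/mnt/SNIP_data1 is a placeholder for our mount point) to trigger the client-side heal Strahil describes, which is effectively what my md5sum runs were doing:

    # What does each brick think still needs healing?
    gluster volume heal SNIP_data1 info

    # Ask the self-heal daemon to process the pending entries now
    gluster volume heal SNIP_data1

    # Heavier option: crawl the whole volume (adds noticeable load)
    gluster volume heal SNIP_data1 full

    # Client-side fallback: touch the listed paths via the FUSE mount
    gluster volume heal SNIP_data1 info | grep '^/' | while read -r f; do
        stat "/mnt/SNIP_data1$f" > /dev/null
    done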
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
