Re: btrfs send extremely slow (almost stuck)
At 09/05/2016 05:41 AM, Oliver Freyermuth wrote:
> On 30.08.2016 02:48, Qu Wenruo wrote:
>> Yes.
>> And more specifically, it doesn't even affect delta backup.
>>
>> For shared extents caused by reflink/dedupe (out-of-band or even incoming
>> in-band), it will be sent as individual files.
>>
>> For contents, they are all the same, just more space usage.
>
> For those interested, I have now actually tested the btrfs send / btrfs
> receive backup for several subvolumes after applying this patch.
> The throughput is finally usable, almost hitting network / IO limits as
> expected - ideal so far! Delta sends also seemed fine for the subvolumes
> for which things worked.
>
> However, for one of my subvolumes I now sadly get:
> send ioctl failed with -2: No such file or directory
> at some point during the transfer, and it sadly seems to be reproducible.
> I do not think it's related to this patch, but of course this makes
> "btrfs send" still unusable to me - I guess it's not ready for general
> use just yet.
> Is there any information I can easily extract / provide to allow the
> experts to fix this issue?

Did you get the partial send stream?
If the send stream contains anything, please use the "--no-data" option to
send the subvolume again to get a metadata-only dump, and upload it for
debugging.

Also, please paste the "btrfs-debug-tree -t " output for debugging.

WARN: the above "btrfs-debug-tree" command will contain file names.
You could use the following sed to wipe the file names:

btrfs-debug-tree -t 5 /dev/sda6 | sed "s/name:.*//"

Thanks,
Qu

> The kernel log shows nothing.
>
> Thanks a lot,
> Oliver

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
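[Editor's note] Qu's sed filter can be sanity-checked on sample text before running it against a real filesystem. A minimal sketch, where the debug-tree-style lines are made-up sample data rather than real output:

```shell
# Feed made-up lines shaped like btrfs-debug-tree output through the same
# sed expression Qu suggests; everything from "name:" onward is wiped,
# while the rest of the metadata line survives.
printf 'item 5 key (256 DIR_ITEM 123) itemoff 100 itemsize 40\nnamelen 10 datalen 0 name: secret.txt\n' |
  sed "s/name:.*//"
```

The same pipeline, applied to the saved debug-tree dump, gives a file-name-free version suitable for uploading.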
Re: [PATCH] btrfs: let btrfs_delete_unused_bgs() to clean relocated bgs
On Fri, 2016-09-02 at 09:35 -0400, Josef Bacik wrote:
> On 09/02/2016 03:46 AM, Naohiro Aota wrote:
>>
>> Currently, btrfs_relocate_chunk() is removing the relocated BG by
>> itself. But the work can be done by btrfs_delete_unused_bgs() (and it's
>> better since it trims the BG). Let's dedupe the code.
>>
>> While btrfs_delete_unused_bgs() is already hitting the relocated BG, it
>> skips the BG since the BG has the "ro" flag set (to keep the balancing
>> BG intact). On the other hand, btrfs cannot drop the "ro" flag here, to
>> prevent additional writes. So this patch makes use of the "removed"
>> flag. btrfs_delete_unused_bgs() now checks the flag to distinguish
>> whether a read-only BG is relocating or not.
>
> This seems racy to me. We remove the last part of the block group, it
> ends up on the unused_bgs_list, we process this list, see that removed
> isn't set and we skip it, then later we set removed, but it's too late.
> I think the right way is to actually do a transaction, set ->removed,
> manually add it to the unused_bgs_list if it's not already, then end the
> transaction. This way we are guaranteed to have the bg on the list when
> it is ready to be removed. This is my analysis after looking at it for
> 10 seconds after being awake for like 30 minutes, so if I'm missing
> something let me know. Thanks,

I don't think a race will happen. Since we are holding
delete_unused_bgs_mutex here, btrfs_delete_unused_bgs() checks the
->removed flag after we unlock the mutex, i.e. we set up the flag
properly. In the case where btrfs_delete_unused_bgs() checks the BG
before we hold delete_unused_bgs_mutex, that BG is removed by it (if
it's empty), and btrfs_relocate_chunk() should never see it.

Regards,
Naohiro
RE: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.
Hi, Sean Fu

> From: Sean Fu [mailto:fxinr...@gmail.com]
> Sent: Sunday, September 04, 2016 7:54 PM
> To: dste...@suse.com
> Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
> zhao...@cn.fujitsu.com; linux-btrfs@vger.kernel.org;
> linux-ker...@vger.kernel.org; Sean Fu
> Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment
> in btrfs_read_chunk_tree.
>
> The input argument root is already set with "fs_info->chunk_root":
> "chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in the
> caller "open_ctree", and "root->fs_info = fs_info" in "btrfs_alloc_root".

The root argument of this function means "any root": the function was
designed to get the chunk root from "any root" at the start. Since there
is only one caller of this function, and the caller always passes
chunk_root as the root argument in the current code, we can remove the
above conversion, and I suggest renaming root to chunk_root to make it
clear, something like:

- btrfs_read_chunk_tree(struct btrfs_root *root)
+ btrfs_read_chunk_tree(struct btrfs_root *chunk_root)

Thanks
Zhaolei

> Signed-off-by: Sean Fu
> ---
>  fs/btrfs/volumes.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 366b335..384a6d2 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
>  	int ret;
>  	int slot;
>
> -	root = root->fs_info->chunk_root;
> -
>  	path = btrfs_alloc_path();
>  	if (!path)
>  		return -ENOMEM;
> --
> 2.6.2
Re: gazillions of Incorrect local/global backref count
On Sat, Sep 3, 2016 at 10:50 PM, Christoph Anton Mitterer wrote:
> Hey.
>
> I just did a btrfs check on my notebook's root fs, with:
> $ uname -a
> Linux heisenberg 4.7.0-1-amd64 #1 SMP Debian 4.7.2-1 (2016-08-28)
> x86_64 GNU/Linux
> $ btrfs --version
> btrfs-progs v4.7.1
>
> During:
> checking extents
>
> it found gazillions of these:
> Incorrect local backref count on 1107980288 root 257 owner 17807428
> offset 13568135168 found 2 wanted 3 back 0x2d69990
> Incorrect local backref count on 1107980288 root 257 owner 14055042
> offset 13568135168 found 2 wanted 3 back 0x2d69930
> Incorrect global backref count on 1107980288 found 4 wanted 6
> backpointer mismatch on [1107980288 61440]
> Incorrect local backref count on 1108049920 root 257 owner 17807428
> offset 13568262144 found 2 wanted 5 back 0x2d69ac0
> Incorrect local backref count on 1108049920 root 257 owner 14055042
> offset 13568262144 found 2 wanted 5 back 0x2d69b20
> Incorrect global backref count on 1108049920 found 4 wanted 10
> backpointer mismatch on [1108049920 77824]
>
> See stdout/err[0] logfiles from the check.
>
> What do they mean?

https://bugzilla.kernel.org/show_bug.cgi?id=155791
http://www.spinics.net/lists/linux-btrfs/msg58142.html

--
Chris Murphy
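[Editor's note] When a check log contains "gazillions" of such lines, a quick per-message tally makes the report manageable. A sketch, where check.log is built here from a few of the reported sample lines (with a real run you would save the stderr of btrfs check instead):

```shell
# Build a tiny sample log in the shape of the reported messages, then
# strip everything from " on " onward and count each message class.
cat > check.log <<'EOF'
Incorrect local backref count on 1107980288 root 257 owner 17807428 offset 13568135168 found 2 wanted 3 back 0x2d69990
Incorrect local backref count on 1107980288 root 257 owner 14055042 offset 13568135168 found 2 wanted 3 back 0x2d69930
Incorrect global backref count on 1107980288 found 4 wanted 6
backpointer mismatch on [1107980288 61440]
EOF
sed 's/ on .*//' check.log | sort | uniq -c
rm check.log
```

The tally gives a quick sense of how many distinct problem classes the checker reports, which is useful when comparing runs across btrfs-progs versions.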
Re: Re[4]: btrfs check "Couldn't open file system" after error in transaction.c
On Sun, Sep 4, 2016 at 1:23 PM, Hendrik Friedel wrote:
> Hello,
>
> here the output of btrfsck:
>
> Checking filesystem on /dev/sdd
> UUID: a8af3832-48c7-4568-861f-e80380dd7e0b
> checking extents
> checking free space cache
> checking fs root
> checking csums
> checking root refs
> checking quota groups
> Ignoring qgroup relation key 24544
> Ignoring qgroup relation key 24610
> Ignoring qgroup relation key 24611
> Ignoring qgroup relation key 25933
> Ignoring qgroup relation key 25934
> Ignoring qgroup relation key 25935
> Ignoring qgroup relation key 25936
> Ignoring qgroup relation key 25937
> Ignoring qgroup relation key 25938
> Ignoring qgroup relation key 25939
> Ignoring qgroup relation key 25939
> Ignoring qgroup relation key 25941
> Ignoring qgroup relation key 25942
> Ignoring qgroup relation key 25958
> Ignoring qgroup relation key 25959
> Ignoring qgroup relation key 25960
> Ignoring qgroup relation key 25961
> Ignoring qgroup relation key 25962
> Ignoring qgroup relation key 25963
> Ignoring qgroup relation key 25964
> Ignoring qgroup relation key 25965
> Ignoring qgroup relation key 25966
> Ignoring qgroup relation key 25966
> Ignoring qgroup relation key 25968
> Ignoring qgroup relation key 25970
> Ignoring qgroup relation key 25971
> Ignoring qgroup relation key 25972
> Ignoring qgroup relation key 25975
> Ignoring qgroup relation key 25976
> Ignoring qgroup relation key 25976
> Ignoring qgroup relation key 25976
> Ignoring qgroup relation key 567172078071971871
> Ignoring qgroup relation key 567172078071971872
> Ignoring qgroup relation key 567172078071971882
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971885
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971886
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Ignoring qgroup relation key 567172078071971892
> Qgroup is already inconsistent before checking
> Counts for qgroup id: 3102 are different
> our:  referenced 174829252608 referenced compressed 174829252608
> disk: referenced 174829252608 referenced compressed 174829252608
> our:  exclusive 2899968 exclusive compressed 2899968
> disk: exclusive 2916352 exclusive compressed 2916352
> diff: exclusive -16384 exclusive compressed -16384
> Counts for qgroup id: 25977 are different
> our:  referenced 47249391616 referenced compressed 47249391616
> disk: referenced 47249391616 referenced compressed 47249391616
> our:  exclusive 90222592 exclusive compressed 90222592
> disk: exclusive 90238976 exclusive compressed 90238976
> diff: exclusive -16384 exclusive compressed -16384
> Counts for qgroup id: 25978 are different
> our:  referenced 174829252608 referenced compressed 174829252608
> disk: referenced 174829252608 referenced compressed 174829252608
> our:  exclusive 1064960 exclusive compressed 1064960
> disk: exclusive 1081344 exclusive compressed 1081344
> diff: exclusive -16384 exclusive compressed -16384
> Counts for qgroup id: 26162 are different
> our:  referenced 65940500480 referenced compressed 65940500480
> disk: referenced 65866997760 referenced compressed 65866997760
> diff: referenced 73502720 referenced compressed 73502720
> our:  exclusive 3991326720 exclusive compressed 3991326720
> disk: exclusive 3960582144 exclusive compressed 3960582144
> diff: exclusive 30744576 exclusive compressed 30744576
> found 8423479726080 bytes used err is 1
> total csum bytes: 8206766844
> total tree bytes: 17669144576
> total fs tree bytes: 7271251968
> total extent tree bytes: 683851776
Re: Re[3]: btrfs check "Couldn't open file system" after error in transaction.c
On Sun, Sep 4, 2016 at 12:51 PM, Hendrik Friedel wrote:
> Hello again,
>
> before overwriting the filesystem, some last questions:
>
>>> Maybe take advantage of the fact it does read only and recreate it.
>>> You could take a btrfs-image and btrfs-debug-tree first,
>>
>> And what do I do with it?
>>
>>> because there's some bug somewhere: somehow it became inconsistent,
>>> and can't be fixed at mount time or even with btrfs check.
>>
>> Ok, so is there any way to help you finding this bug?
>
> Anything I can do here?
>
>> Coming back to my objectives:
>> -Understand the reason behind the issue and prevent it in future
>> Finding the bug would help on the above
>>
>> -If not possible to repair the filesystem:
>>    -understand if the data that I read from the drive is valid or
>> corrupted
>> Can you answer this?
>>
>> As mentioned: I do have a backup, a month old. The data does not change
>> so regularly, so most should be ok.
>> Now I have two sources of data: the backup and the current degraded
>> filesystem.
>> If data differs, which one do I take? Is it safe to use the more recent
>> one from the degraded filesystem?
>
> And can you help me on these points?
>
> FYI, I did a
> btrfsck --init-csum-tree /dev/sdd
> btrfs rescue zero-log btrfs-zero-log
> btrfsck /dev/sdd

Curious that this is fixing a parent transid problem... not sure why.
Only a developer working on btrfsck could answer this. They'd need the
btrfs-image from before these things were done, to see what's wrong with
the file system that causes check to fail. Changing anything changes the
evidence of what was wrong.

> now. The last command is still running. It seems to be working; is
> there a way to be sure that the data is all ok again?

Not by Btrfs. The problem now is that init-csum-tree recomputed the
csums for everything. If any files were corrupt, they now have csums
based on that corruption, so they will read as OK by Btrfs. That's the
problem with init-csum-tree. So now you need a different way to
confirm/deny whether the files are really good or not.

--
Chris Murphy
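[Editor's note] One way to do that, assuming the month-old backup is still trusted: hash both trees and compare the manifests. A minimal sketch; the two throwaway directories below stand in for the hypothetical backup and current mount points so the commands can run anywhere:

```shell
# Build two tiny example trees (stand-ins for the backup and the suspect
# filesystem), hash every file in each, and diff the manifests. Lines
# that appear in the diff are files that differ or exist on one side only.
backup=$(mktemp -d); current=$(mktemp -d)
echo "unchanged" > "$backup/a.txt"; echo "unchanged" > "$current/a.txt"
echo "old"       > "$backup/b.txt"; echo "new"       > "$current/b.txt"

# Hash relative to each root so paths line up between the two manifests.
(cd "$backup"  && find . -type f -exec sha256sum {} + | sort -k 2) > backup.sums
(cd "$current" && find . -type f -exec sha256sum {} + | sort -k 2) > current.sums

diff backup.sums current.sums || true   # exit status 1 only means "differences found"
rm -r "$backup" "$current" backup.sums current.sums
```

On the real data this narrows the manual inspection down to only the files whose hashes disagree between backup and degraded filesystem.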
Re: Re[2]: btrfs check "Couldn't open file system" after error in transaction.c
Lost track of this... sorry.

On Sun, Aug 28, 2016 at 12:04 PM, Hendrik Friedel wrote:
> Hi Chris,
>
> thanks for your reply - especially on a Sunday.
>>> I have a filesystem (three disks with no raid)
>>
>> So it's data single *and* metadata single?
>
> No:
> Data, single: total=8.14TiB, used=7.64TiB
> System, RAID1: total=32.00MiB, used=912.00KiB
> Metadata, RAID1: total=18.00GiB, used=16.45GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
>>> btrfs check will lead to "Couldn't open file system"

That's a bug worth filing. That bug report will need a URL for where you
put the btrfs-image file.

>> Maybe take advantage of the fact it does read only and recreate it.
>> You could take a btrfs-image and btrfs-debug-tree first,
>
> And what do I do with it?

Put it somewhere it can live a while; it might be months before a dev
gets around to looking at it. I usually put them on Google Drive in the
public folder, and then post the URL (get shareable link) in the bug
report.

>> because there's some bug somewhere: somehow it became inconsistent,
>> and can't be fixed at mount time or even with btrfs check.
>
> Ok, so is there any way to help you finding this bug?
> Coming back to my objectives:
> -Understand the reason behind the issue and prevent it in future
> Finding the bug would help on the above

No idea.

> -If not possible to repair the filesystem:
>    -understand if the data that I read from the drive is valid or
> corrupted
> Can you answer this?

Other than nocow files, which do not have csums, Btrfs will spit back an
I/O error and the path to the bad file rather than hand over data it
thinks is corrupt (doesn't match csum). So data read from the volume
should be valid.

> As mentioned: I do have a backup, a month old. The data does not change
> so regularly, so most should be ok.
> Now I have two sources of data: the backup and the current degraded
> filesystem.
> If data differs, which one do I take? Is it safe to use the more recent
> one from the degraded filesystem?

If data differs you have to figure out a way to inspect the file to
determine which one is correct. Databases have their own consistency
checks, for example; if it's an image, open it in a viewer - big
problems will be visible, small problems might be just one wrong pixel
and you may not even notice it.

--
Chris Murphy
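[Editor's note] Before opening a differing file in an application, cmp at least tells you where the two copies diverge, which gives a rough sense of whether it's localized damage or a wholesale rewrite. A sketch on two throwaway files standing in for the backup and degraded copies:

```shell
# cmp reports the first byte at which the two copies diverge; a single
# late offset in a large file suggests a small localized difference,
# while diverging from byte 1 suggests the file was rewritten entirely.
printf 'hello world\n' > from-backup.txt
printf 'hello wOrld\n' > from-degraded.txt
cmp from-backup.txt from-degraded.txt || true   # exit 1 just flags a difference
rm from-backup.txt from-degraded.txt
```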
Re: btrfs send extremely slow (almost stuck)
On 30.08.2016 02:48, Qu Wenruo wrote:
> Yes.
> And more specifically, it doesn't even affect delta backup.
>
> For shared extents caused by reflink/dedupe (out-of-band or even incoming
> in-band), it will be sent as individual files.
>
> For contents, they are all the same, just more space usage.

For those interested, I have now actually tested the btrfs send / btrfs
receive backup for several subvolumes after applying this patch.
The throughput is finally usable, almost hitting network / IO limits as
expected - ideal so far! Delta sends also seemed fine for the subvolumes
for which things worked.

However, for one of my subvolumes I now sadly get:
send ioctl failed with -2: No such file or directory
at some point during the transfer, and it sadly seems to be reproducible.
I do not think it's related to this patch, but of course this makes
"btrfs send" still unusable to me - I guess it's not ready for general
use just yet.

Is there any information I can easily extract / provide to allow the
experts to fix this issue? The kernel log shows nothing.

Thanks a lot,
Oliver
Re[4]: btrfs check "Couldn't open file system" after error in transaction.c
Hello,

here the output of btrfsck:

Checking filesystem on /dev/sdd
UUID: a8af3832-48c7-4568-861f-e80380dd7e0b
checking extents
checking free space cache
checking fs root
checking csums
checking root refs
checking quota groups
Ignoring qgroup relation key 24544
Ignoring qgroup relation key 24610
Ignoring qgroup relation key 24611
Ignoring qgroup relation key 25933
Ignoring qgroup relation key 25934
Ignoring qgroup relation key 25935
Ignoring qgroup relation key 25936
Ignoring qgroup relation key 25937
Ignoring qgroup relation key 25938
Ignoring qgroup relation key 25939
Ignoring qgroup relation key 25939
Ignoring qgroup relation key 25941
Ignoring qgroup relation key 25942
Ignoring qgroup relation key 25958
Ignoring qgroup relation key 25959
Ignoring qgroup relation key 25960
Ignoring qgroup relation key 25961
Ignoring qgroup relation key 25962
Ignoring qgroup relation key 25963
Ignoring qgroup relation key 25964
Ignoring qgroup relation key 25965
Ignoring qgroup relation key 25966
Ignoring qgroup relation key 25966
Ignoring qgroup relation key 25968
Ignoring qgroup relation key 25970
Ignoring qgroup relation key 25971
Ignoring qgroup relation key 25972
Ignoring qgroup relation key 25975
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 25976
Ignoring qgroup relation key 567172078071971871
Ignoring qgroup relation key 567172078071971872
Ignoring qgroup relation key 567172078071971882
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971885
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971886
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Ignoring qgroup relation key 567172078071971892
Qgroup is already inconsistent before checking
Counts for qgroup id: 3102 are different
our:  referenced 174829252608 referenced compressed 174829252608
disk: referenced 174829252608 referenced compressed 174829252608
our:  exclusive 2899968 exclusive compressed 2899968
disk: exclusive 2916352 exclusive compressed 2916352
diff: exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 25977 are different
our:  referenced 47249391616 referenced compressed 47249391616
disk: referenced 47249391616 referenced compressed 47249391616
our:  exclusive 90222592 exclusive compressed 90222592
disk: exclusive 90238976 exclusive compressed 90238976
diff: exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 25978 are different
our:  referenced 174829252608 referenced compressed 174829252608
disk: referenced 174829252608 referenced compressed 174829252608
our:  exclusive 1064960 exclusive compressed 1064960
disk: exclusive 1081344 exclusive compressed 1081344
diff: exclusive -16384 exclusive compressed -16384
Counts for qgroup id: 26162 are different
our:  referenced 65940500480 referenced compressed 65940500480
disk: referenced 65866997760 referenced compressed 65866997760
diff: referenced 73502720 referenced compressed 73502720
our:  exclusive 3991326720 exclusive compressed 3991326720
disk: exclusive 3960582144 exclusive compressed 3960582144
diff: exclusive 30744576 exclusive compressed 30744576
found 8423479726080 bytes used err is 1
total csum bytes: 8206766844
total tree bytes: 17669144576
total fs tree bytes: 7271251968
total extent tree bytes: 683851776
btree space waste bytes: 2859469730
file data blocks allocated: 16171232772096
 referenced 13512171663360

What does that tell us?

Greetings,
Hendrik

-- Original Message -- From: "Hendrik Friedel"
Re[3]: btrfs check "Couldn't open file system" after error in transaction.c
Hello again,

before overwriting the filesystem, some last questions:

>> Maybe take advantage of the fact it does read only and recreate it.
>> You could take a btrfs-image and btrfs-debug-tree first,
>
> And what do I do with it?
>
>> because there's some bug somewhere: somehow it became inconsistent,
>> and can't be fixed at mount time or even with btrfs check.
>
> Ok, so is there any way to help you finding this bug?

Anything I can do here?

> Coming back to my objectives:
> -Understand the reason behind the issue and prevent it in future
> Finding the bug would help on the above
>
> -If not possible to repair the filesystem:
>    -understand if the data that I read from the drive is valid or
> corrupted
> Can you answer this?
>
> As mentioned: I do have a backup, a month old. The data does not change
> so regularly, so most should be ok.
> Now I have two sources of data: the backup and the current degraded
> filesystem.
> If data differs, which one do I take? Is it safe to use the more recent
> one from the degraded filesystem?

And can you help me on these points?

FYI, I did a
btrfsck --init-csum-tree /dev/sdd
btrfs rescue zero-log btrfs-zero-log
btrfsck /dev/sdd
now. The last command is still running. It seems to be working; is there
a way to be sure that the data is all ok again?

Regards,
Hendrik
Re: OOM killer and Btrfs
On Sun, Sep 4, 2016, at 12:06, Markus Trippelsdorf wrote:
> On 2016.09.04 at 11:59 +0200, Francesco Turco wrote:
>> Is the problem already known? Should I report a bug? Is there a patch I
>> can try? Thanks.
>
> This issue was recently fixed by:
>
> commit 6b4e3181d7bd5ca5ab6f45929e4a5ffa7ab4ab7f
> Author: Michal Hocko
> Date:   Thu Sep 1 16:14:41 2016 -0700
>
>     mm, oom: prevent premature OOM killer invocation for high order
>     request
>
> It will be backported to the 4.7.x stable kernel, too.

Great, I will wait for a new 4.7.x release then :) Thank you for the
info!

--
https://www.fturco.net/
Re: OOM killer and Btrfs
On 2016.09.04 at 11:59 +0200, Francesco Turco wrote:
> I use Btrfs on a Gentoo Linux system with kernel 4.7.2. When my computer
> is under heavy I/O load some applications often crash, for example
> ClamAV, Firefox or Portage. I suspect the problem is due to Btrfs, but I
> may be wrong.
>
> These are the most recent error messages from journalctl, but I have
> many other similar ones in my logs:
>
> *** BEGIN ***
>
> Sep 04 10:13:26 desktop kernel: gpg-agent invoked oom-killer:
> gfp_mask=0x27080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK),
> order=2, oom_
>
> Is the problem already known? Should I report a bug? Is there a patch I
> can try? Thanks.

This issue was recently fixed by:

commit 6b4e3181d7bd5ca5ab6f45929e4a5ffa7ab4ab7f
Author: Michal Hocko
Date:   Thu Sep 1 16:14:41 2016 -0700

    mm, oom: prevent premature OOM killer invocation for high order
    request

It will be backported to the 4.7.x stable kernel, too.

--
Markus
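[Editor's note] Once a stable release carrying the backport lands, a quick way to check whether the running kernel already includes it is a version comparison with sort -V. A sketch; the 4.7.3 threshold is an assumption, so substitute whichever 4.7.x release actually ships the fix:

```shell
# Compare the running kernel against a minimum version using sort -V
# (version-aware sort). "required" is a hypothetical threshold -- replace
# it with the first 4.7.x stable release that carries the backport.
required=4.7.3
current=$(uname -r | cut -d- -f1)
oldest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
if [ "$oldest" = "$required" ]; then
    echo "kernel $current is >= $required: fix should be included"
else
    echo "kernel $current predates $required: fix not yet included"
fi
```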
OOM killer and Btrfs
I use Btrfs on a Gentoo Linux system with kernel 4.7.2. When my computer
is under heavy I/O load some applications often crash, for example
ClamAV, Firefox or Portage. I suspect the problem is due to Btrfs, but I
may be wrong.

These are the most recent error messages from journalctl, but I have
many other similar ones in my logs:

*** BEGIN ***

Sep 04 10:13:26 desktop kernel: gpg-agent invoked oom-killer: gfp_mask=0x27080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK), order=2, oom_
Sep 04 10:13:26 desktop kernel: gpg-agent cpuset=/ mems_allowed=0
Sep 04 10:13:26 desktop kernel: CPU: 1 PID: 15883 Comm: gpg-agent Not tainted 4.7.2-gentoo #6
Sep 04 10:13:26 desktop kernel: Hardware name: /DQ35JO, BIOS JOQ3510J.86A.1143.2010.1209.0048 12/09/2010
Sep 04 10:13:26 desktop kernel: 8801258ebbb0 813db638 8801258ebd48
Sep 04 10:13:26 desktop kernel: 88009510c800 8801258ebbe8 811bbe3d 8801258ebd48
Sep 04 10:13:26 desktop kernel: 88009510c800 81e30816 001c
Sep 04 10:13:26 desktop kernel: Call Trace:
Sep 04 10:13:26 desktop kernel: [] dump_stack+0x4d/0x65
Sep 04 10:13:26 desktop kernel: [] dump_header+0x56/0x16e
Sep 04 10:13:26 desktop kernel: [] oom_kill_process+0x218/0x3e0
Sep 04 10:13:26 desktop kernel: [] out_of_memory+0x3ba/0x460
Sep 04 10:13:26 desktop kernel: [] __alloc_pages_nodemask+0xedd/0xf00
Sep 04 10:13:26 desktop kernel: [] alloc_kmem_pages_node+0x4a/0xc0
Sep 04 10:13:26 desktop kernel: [] copy_process.part.50+0x104/0x1760
Sep 04 10:13:26 desktop kernel: [] ? check_preempt_wakeup+0x10a/0x240
Sep 04 10:13:26 desktop kernel: [] ? __set_task_blocked+0x2d/0x70
Sep 04 10:13:26 desktop kernel: [] _do_fork+0xc5/0x370
Sep 04 10:13:26 desktop kernel: [] ? SyS_pselect6+0x13a/0x220
Sep 04 10:13:26 desktop kernel: [] SyS_clone+0x14/0x20
Sep 04 10:13:26 desktop kernel: [] do_syscall_64+0x4b/0xa0
Sep 04 10:13:26 desktop kernel: [] entry_SYSCALL64_slow_path+0x25/0x25
Sep 04 10:13:26 desktop kernel: Mem-Info:
Sep 04 10:13:26 desktop kernel: active_anon:173869 inactive_anon:274253 isolated_anon:0 active_file:888485 inactive_file:366424 isolated_file:0 unevictable:8 dirty:231 writeback:0 unstable:0 slab_reclaimable:240788 slab_unreclaimable:10484 mapped:46080 shmem:2372 pagetables:8521 bounce:0 free:36342 free_pcp:0 free_cma:0
Sep 04 10:13:26 desktop kernel: Node 0 DMA free:15768kB min:20kB low:32kB high:44kB active_anon:0kB inactive_anon:0kB active_file:0kB inacti
Sep 04 10:13:26 desktop kernel: lowmem_reserve[]: 0 3219 7890 7890
Sep 04 10:13:26 desktop kernel: Node 0 DMA32 free:48736kB min:4632kB low:7928kB high:11224kB active_anon:164348kB inactive_anon:560608kB act
Sep 04 10:13:26 desktop kernel: lowmem_reserve[]: 0 0 4671 4671
Sep 04 10:13:26 desktop kernel: Node 0 Normal free:80864kB min:6720kB low:11500kB high:16280kB active_anon:531128kB inactive_anon:536404kB a
Sep 04 10:13:26 desktop kernel: lowmem_reserve[]: 0 0 0 0
Sep 04 10:13:26 desktop kernel: Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB (U) 1*32kB (U) 1*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*
Sep 04 10:13:26 desktop kernel: Node 0 DMA32: 7240*4kB (UME) 2472*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0
Sep 04 10:13:26 desktop kernel: Node 0 Normal: 19846*4kB (UMEH) 29*8kB (UH) 12*16kB (H) 7*32kB (H) 0*64kB 1*128kB (H) 3*256kB (H) 0*512kB 0*
Sep 04 10:13:26 desktop kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 04 10:13:26 desktop kernel: 1261746 total pagecache pages
Sep 04 10:13:26 desktop kernel: 4498 pages in swap cache
Sep 04 10:13:26 desktop kernel: Swap cache stats: add 927111, delete 922613, find 398281/626731
Sep 04 10:13:26 desktop kernel: Free swap = 8024748kB
Sep 04 10:13:26 desktop kernel: Total swap = 8388604kB
Sep 04 10:13:26 desktop kernel: 2079412 pages RAM
Sep 04 10:13:26 desktop kernel: 0 pages HighMem/MovableOnly
Sep 04 10:13:26 desktop kernel: 53317 pages reserved
Sep 04 10:13:26 desktop kernel: [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Sep 04 10:13:26 desktop kernel: [ 1655] 0 16554624610346 88 3 79 0 systemd-journal
Sep 04 10:13:26 desktop kernel: [ 1665] 0 166525033 145 16 3 60 0 lvmetad
Sep 04 10:13:26 desktop kernel: [ 1686] 0 1686 8671 251 19 3 469 -1000 systemd-udevd
Sep 04 10:13:26 desktop kernel: [ 1805] 108 180528434 243 26 4 101 0 systemd-timesyn
Sep 04 10:13:26 desktop kernel: [ 1813] 109 1813 9863 1398 24
Re: gazillions of Incorrect local/global backref count
On Sun, 2016-09-04 at 05:33 +0000, Paul Jones wrote:
> The errors are wrong. I nearly ruined my filesystem a few days ago by
> trying to repair similar errors; thankfully all seems ok.
> Check again with btrfs-progs 4.6.1 and see if the errors go away -
> mine did.
> See the open bug https://bugzilla.kernel.org/show_bug.cgi?id=155791
> for more details.

Thanks for the pointer :)

I can at least confirm that my system seems to work normally; scrub
didn't bring up any errors either, nor are there any kernel messages...

The interesting thing: I have some pretty large btrfs filesystems on
those 8TiB Seagate disks (nearly full, with some million files), which I
have also scanned with v4.7... and no errors. Only my system fs seems to
be "affected".

Well, it's not my first case of false positives in btrfs check
(https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg48325.html),
so I was more relaxed this time (at least a bit ;-) ).

Cheers,
Chris.