Re: Hand Patching a BTRFS Superblock?
On 2017-12-30 03:30, Stirling Westrup wrote:
> You were right! grep found two more signature blocks! How do I make use of
> them?
>
> videon:~ # LC_ALL=C grep -obUaP "\x5F\x42\x48\x52\x66\x53\x5F\x4D" /dev/sde
> 65600:_BHRfS_M

This is the correct one. The offset is 64K + 64.

> 26697111807:_BHRfS_M

It is a little tricky now. Btrfs has its superblocks at:

64K (primary)
64M (backup 1)
256G (backup 2)

This one is at about 25G, and the magic is not at offset 64 within the block, which is where it sits inside a superblock.

Is there any btrfs image inside the fs?

> 26854350428:_BHRfS_M

Much like the previous one.

Despite that, you could try "inspect dump-super --bytenr" to check whether it's the super you want.

The bytenr values you could pass are:
26697111743
26854350364

At this point, I would say the chance of recovering data is really very low.

Thanks,
Qu

>
> On Thu, Dec 28, 2017 at 11:00 PM, Qu Wenruo wrote:
>> On 2017-12-29 11:35, Stirling Westrup wrote:
>>> On Thu, Dec 28, 2017 at 9:08 PM, Qu Wenruo wrote:
>>>> I strongly recommend to do a binary search for magic number "5f42 4852
>>>> 6653 5f4d" to locate the real offset (if it's offset, not a toasted image)
>>>>
>>> I don't understand, how would I do a binary search for that signature?
>>>
>> The most stupid idea is to use xxd and grep.
>>
>> Something like:
>>
>> # xxd /dev/sde | grep 5f42 -C1
>>
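The offset arithmetic in this reply can be sketched in shell. This is only an illustration: the helper names sb_bytenr and is_standard_sb are made up, and the sketch relies on nothing beyond what the thread states (the magic sits 64 bytes into a superblock, and the standard locations are 64K, 64M and 256G). The dump-super command is only printed, not run.

```shell
#!/bin/sh
# Sketch only: helper names are invented for illustration.
# The magic "_BHRfS_M" sits 64 bytes into a superblock (per the thread),
# so the superblock start is the grep offset minus 64.
MAGIC_OFFSET=64

sb_bytenr() {
    # $1: byte offset of the magic as printed by `grep -ob`
    echo $(( $1 - MAGIC_OFFSET ))
}

is_standard_sb() {
    # Succeeds if $1 is one of the standard superblock locations
    # named in the thread: 64K, 64M, 256G.
    [ "$1" -eq $((64 * 1024)) ] ||
    [ "$1" -eq $((64 * 1024 * 1024)) ] ||
    [ "$1" -eq $((256 * 1024 * 1024 * 1024)) ]
}

for off in 65600 26697111807 26854350428; do
    start=$(sb_bytenr "$off")
    if is_standard_sb "$start"; then
        echo "$start: standard superblock location"
    else
        echo "$start: non-standard location, could be inspected with:"
        echo "  btrfs inspect-internal dump-super --bytenr $start /dev/sde"
    fi
done
```

Only the first hit lands on a standard location; the other two yield the bytenr values quoted above.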
Re: btrfs balance problems
On 12/28/2017 12:15 PM, Nikolay Borisov wrote: > > On 23.12.2017 13:19, James Courtier-Dutton wrote: >> >> During a btrfs balance, the process hogs all CPU. >> Or, to be exact, any other program that wishes to use the SSD during a >> btrfs balance is blocked for long periods. Long periods being more >> than 5 seconds. >> Is there any way to multiplex SSD access while btrfs balance is >> operating, so that other applications can still access the SSD with >> relatively low latency? >> >> My guess is that btrfs is doing a transaction with a large number of >> SSD blocks at a time, and thus blocking other applications. >> >> This makes for atrocious user interactivity as well as applications >> failing because they cannot access the disk in a relatively low latent >> manner. >> For, example, this is causing a High Definition network CCTV >> application to fail. >> >> What I would really like, is for some way to limit SSD bandwidths to >> applications. >> For example the CCTV app always gets the bandwidth it needs, and all >> other applications can still access the SSD, but are rate limited. >> This would fix my particular problem. >> We have rate limiting for network applications, why not disk access also? > > So how are you running btrfs balance? Or, to again take one step further back... *Why* are you running btrfs balance at all? :) > Are you using any filters > whatsoever? The documentation > [https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-balance] has the > following warning: > > Warning: running balance without filters will take a lot of time as it > basically rewrites the entire filesystem and needs to update all block > pointers. -- Hans van Kranenburg -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs balance problems
On Thu, 28 Dec 2017 00:39:37, Duncan wrote:

>> How can I get btrfs balance to work in the background, without adversely
>> affecting other applications?
>
> I'd actually suggest a different strategy.
>
> What I did here way back when I was still on reiserfs on spinning rust,
> where it made more difference than on ssd, but I kept the settings when
> I switched to ssd and btrfs, and at least some others have mentioned
> that similar settings helped them on btrfs as well, is...
>
> Problem: The kernel virtual-memory subsystem's writeback cache was
> originally configured for systems with well under a Gigabyte of RAM, and
> the defaults no longer work so well on multi-GiB-RAM systems,
> particularly above 8 GiB RAM, because they are based on a percentage of
> available RAM, and will typically let several GiB of dirty writeback
> cache accumulate before kicking off any attempt to actually write it to
> storage. On spinning rust, when writeback /does/ finally kickoff, this
> can result in hogging the IO for well over half a minute at a time,
> where 30 seconds also happens to be the default "flush it anyway" time.

This is somewhat like the buffer bloat discussion for networking: big buffers increase latency. There is more than one type of buffer.

In addition to what Duncan wrote (the first type of buffer), the kernel recently got a new option to fight this "buffer bloat": writeback throttling. It may help to enable that option.

The second type of buffer is the io queue. So you may also want to lower the io queue depth (nr_requests) of your devices. I think it defaults to 128, while most consumer drives only have a queue depth of 31 or 32 commands. Thus, reducing nr_requests for some of your devices may help you achieve better latency (at the cost of throughput).

Especially if working with io schedulers that do not implement io priorities, you could simply lower nr_requests to around or below the native command queue depth of your devices. The device itself can handle it better in that case, especially on spinning rust, as the firmware knows when to pull certain selected commands from the queue during a rotation of the media. The kernel knows nothing about rotary positions; it can only use the queue to prioritize and reorder requests but cannot take advantage of the rotary positions of the heads.

See

$ grep ^ /sys/block/*/queue/nr_requests

You may also get better results by increasing nr_requests instead, but at the cost of also adjusting the write buffer sizes, because with large nr_requests you don't want writes to block so early, at least not when you need good latency. This probably works best with schedulers that care about latency, like deadline or kyber.

For testing, keep in mind that every setting interacts with the others. So change one at a time, run your tests, then change another and see how that relates to the first change, even when the first change made your experience worse.

Another tip that's missing: put different access classes onto different devices. That is, if you have a directory structure that's mostly written to, put it on its own physical devices, with separate tuning and an appropriate filesystem (log-structured and cow filesystems are good at streaming writes). Put read-mostly workloads on their own device and filesystems as well. Put realtime workloads on their own device and filesystems. This gives you a much better chance to succeed.

--
Regards,
Kai

Replies to list-only preferred.
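To compare nr_requests with the native queue depth of each device, something like the following could be used. It is a rough sketch, not a tuning tool: report_queue_depths is a made-up name, and it assumes the device exposes its depth as device/queue_depth in sysfs, which not every driver does (such devices are reported as "unknown").

```shell
#!/bin/sh
# Sketch only: report_queue_depths is an invented helper name.
# For each block device, print the kernel queue size (nr_requests) next to
# the device's own command queue depth, when the device exposes one.
report_queue_depths() {
    # $1: sysfs block directory; defaults to /sys/block
    root=${1:-/sys/block}
    for dev in "$root"/*; do
        [ -r "$dev/queue/nr_requests" ] || continue
        nr=$(cat "$dev/queue/nr_requests")
        if [ -r "$dev/device/queue_depth" ]; then
            depth=$(cat "$dev/device/queue_depth")
        else
            depth=unknown
        fi
        echo "${dev##*/}: nr_requests=$nr queue_depth=$depth"
    done
}

report_queue_depths

# To try a lower value on one device (as root; sda is just an example):
#   echo 32 > /sys/block/sda/queue/nr_requests
```

The write at the end is left as a comment on purpose, in line with the advice to change one setting at a time and test after each change.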
Re: [PATCH v2] Btrfs: enchanse raid1/10 balance heuristic
2017-12-29 22:14 GMT+03:00 Dmitrii Tcvetkov: > On Fri, 29 Dec 2017 21:44:19 +0300 > Dmitrii Tcvetkov wrote: >> > +/** >> > + * guess_optimal - return guessed optimal mirror >> > + * >> > + * Optimal expected to be pid % num_stripes >> > + * >> > + * That's generaly ok for spread load >> > + * Add some balancer based on queue leght to device >> > + * >> > + * Basic ideas: >> > + * - Sequential read generate low amount of request >> > + *so if load of drives are equal, use pid % num_stripes >> > balancing >> > + * - For mixed rotate/non-rotate mirrors, pick non-rotate as >> > optimal >> > + *and repick if other dev have "significant" less queue lenght >> > + * - Repick optimal if queue leght of other mirror are less >> > + */ >> > +static int guess_optimal(struct map_lookup *map, int optimal) >> > +{ >> > + int i; >> > + int round_down = 8; >> > + int num = map->num_stripes; >> >> num has to be initialized from map->sub_stripes if we're reading >> RAID10, otherwise there will be NULL pointer dereference >> > > Check can be like: > if (map->type & BTRFS_BLOCK_GROUP_RAID10) > num = map->sub_stripes; > >>@@ -5804,10 +5914,12 @@ static int __btrfs_map_block(struct >>btrfs_fs_info *fs_info, >> stripe_index += mirror_num - 1; >> else { >> int old_stripe_index = stripe_index; >>+ optimal = guess_optimal(map, >>+ current->pid % >>map->num_stripes); >> stripe_index = find_live_mirror(fs_info, map, >> stripe_index, >> map->sub_stripes, >> stripe_index + >>-current->pid % >>map->sub_stripes, >>+optimal, >> dev_replace_is_ongoing); >> mirror_num = stripe_index - old_stripe_index >> + 1; } >>-- >>2.15.1 > > Also here calculation should be with map->sub_stripes too. > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Why you think we need such check? I.e. 
guess_optimal() is always called for find_live_mirror(), both in the same context, like this:

	if (map->type & BTRFS_BLOCK_GROUP_RAID10) {
		u32 factor = map->num_stripes / map->sub_stripes;

		stripe_nr = div_u64_rem(stripe_nr, factor, &stripe_index);
		stripe_index *= map->sub_stripes;

		if (need_full_stripe(op))
			num_stripes = map->sub_stripes;
		else if (mirror_num)
			stripe_index += mirror_num - 1;
		else {
			int old_stripe_index = stripe_index;
			stripe_index = find_live_mirror(fs_info, map,
					stripe_index,
					map->sub_stripes, stripe_index +
					current->pid % map->sub_stripes,
					dev_replace_is_ongoing);
			mirror_num = stripe_index - old_stripe_index + 1;
		}

So it is useless to check that internally.

---
Also, fio results for all hdd raid1, results from waxhead:

Original:

Disk-4k-randread-depth-32: (g=0): rw=randread, bs=(R) 4096B-512KiB, (W) 4096B-512KiB, (T) 4096B-512KiB, ioengine=libaio, iodepth=32
Disk-4k-read-depth-8: (g=0): rw=read, bs=(R) 4096B-512KiB, (W) 4096B-512KiB, (T) 4096B-512KiB, ioengine=libaio, iodepth=8
Disk-4k-randwrite-depth-8: (g=0): rw=randwrite, bs=(R) 4096B-512KiB, (W) 4096B-512KiB, (T) 4096B-512KiB, ioengine=libaio, iodepth=8
fio-3.1
Starting 3 processes
Disk-4k-randread-depth-32: Laying out IO file (1 file / 65536MiB)
Jobs: 3 (f=3): [r(1),R(1),w(1)][100.0%][r=120MiB/s,w=9.88MiB/s][r=998,w=96 IOPS][eta 00m:00s]
Disk-4k-randread-depth-32: (groupid=0, jobs=1): err= 0: pid=3132: Fri Dec 29 16:16:33 2017
   read: IOPS=375, BW=41.3MiB/s (43.3MB/s)(24.2GiB/600128msec)
    slat (usec): min=15, max=206039, avg=88.71, stdev=990.35
    clat (usec): min=357, max=3487.1k, avg=85022.93, stdev=141872.25
     lat (usec): min=399, max=3487.2k, avg=85112.58, stdev=141880.31
    clat percentiles (msec):
     |  1.00th=[5], 5.00th=[7], 10.00th=[9], 20.00th=[ 13],
     | 30.00th=[ 19], 40.00th=[ 27], 50.00th=[ 39], 60.00th=[ 56],
     | 70.00th=[ 83], 80.00th=[ 127], 90.00th=[ 209], 95.00th=[ 300],
     | 99.00th=[ 600], 99.50th=[ 852], 99.90th=[ 1703], 99.95th=[ 2165],
     | 99.99th=[ 2937]
   bw ( KiB/s): min= 392, max=75824, per=30.46%, avg=42736.09, stdev=12019.09, samples=1186
   iops: min=3, max= 500, avg=380.24, stdev=99.50, samples=1186
  lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec) : 2=0.01%, 4=0.29%, 10=12.33%, 20=19.67%, 50=24.92%
  lat (msec) : 100=17.51%, 250=18.05%, 500=5.72%, 750=0.85%, 1000=0.28%
  lat (msec) : 2000=0.29%, >=2000=0.07%
  cpu : usr=0.67%, sys=4.62%, ctx=215716, majf=0, minf=526
  IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
Re: WARNING: CPU: 1 PID: 3016 at fs/btrfs/ctree.h:1564 btrfs_update_device+0x189/0x190 [btrfs]
On 29.12.2017 20:17, Elimar Riesebieter wrote:
> Thanks,
>
> * Nikolay Borisov [2017-12-29 19:23 +0200]:
>
> [...]
>
>> So OP:
>>
>> Update your btrfs-progs package to latest 4.14 and run btrfs rescue :
>>
>> btrfs rescue fix-device-size
>
> I installed btrfs-progs 4.14. Can't run
> 'btrfs rescue fix-device-size /dev/sd(a|b)3'. The devices are mounted
> including my root...
>
> How to accomplish?

Then you have to resize your fs to a multiple of 4k, either up or down.

>
> Thanks in advance
>
> Elimar
>
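The "multiple of 4k" part is plain arithmetic and might look like this for the "down" direction. A hedged sketch only: align_down_4k is a made-up helper, the size is an invented example rather than a value from the report, and devid 1 and the mount point / in the printed command are placeholders. The resize command is printed, not executed.

```shell
#!/bin/sh
# Sketch only: align_down_4k is an invented helper, and the size below is a
# made-up example, not a value from the report.
align_down_4k() {
    # $1: size in bytes; prints the largest multiple of 4096 not above it
    echo $(( $1 / 4096 * 4096 ))
}

size=4000000000512
aligned=$(align_down_4k "$size")
echo "current size: $size"
echo "aligned size: $aligned"
# Only printed here, not executed; devid 1 and / are placeholders.
echo "btrfs filesystem resize 1:$aligned /"
```

The resize works on a mounted filesystem, which is the situation described above where fix-device-size cannot be run.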
Re: Hand Patching a BTRFS Superblock?
You were right! grep found two more signature blocks! How do I make use of them?

videon:~ # LC_ALL=C grep -obUaP "\x5F\x42\x48\x52\x66\x53\x5F\x4D" /dev/sde
65600:_BHRfS_M
26697111807:_BHRfS_M
26854350428:_BHRfS_M

On Thu, Dec 28, 2017 at 11:00 PM, Qu Wenruo wrote:
> On 2017-12-29 11:35, Stirling Westrup wrote:
>> On Thu, Dec 28, 2017 at 9:08 PM, Qu Wenruo wrote:
>>> I strongly recommend to do a binary search for magic number "5f42 4852
>>> 6653 5f4d" to locate the real offset (if it's offset, not a toasted image)
>>>
>> I don't understand, how would I do a binary search for that signature?
>>
> The most stupid idea is to use xxd and grep.
>
> Something like:
>
> # xxd /dev/sde | grep 5f42 -C1
>

--
Stirling Westrup
Programmer, Entrepreneur.
https://www.linkedin.com/e/fpf/77228
http://www.linkedin.com/in/swestrup
http://technaut.livejournal.com
http://sourceforge.net/users/stirlingwestrup
Re: [PATCH v2] Btrfs: enchanse raid1/10 balance heuristic
On Fri, 29 Dec 2017 21:44:19 +0300
Dmitrii Tcvetkov wrote:

> > +/**
> > + * guess_optimal - return guessed optimal mirror
> > + *
> > + * Optimal expected to be pid % num_stripes
> > + *
> > + * That's generaly ok for spread load
> > + * Add some balancer based on queue leght to device
> > + *
> > + * Basic ideas:
> > + *  - Sequential read generate low amount of request
> > + *    so if load of drives are equal, use pid % num_stripes balancing
> > + *  - For mixed rotate/non-rotate mirrors, pick non-rotate as optimal
> > + *    and repick if other dev have "significant" less queue lenght
> > + *  - Repick optimal if queue leght of other mirror are less
> > + */
> > +static int guess_optimal(struct map_lookup *map, int optimal)
> > +{
> > +	int i;
> > +	int round_down = 8;
> > +	int num = map->num_stripes;
>
> num has to be initialized from map->sub_stripes if we're reading
> RAID10, otherwise there will be NULL pointer dereference
>

Check can be like:

	if (map->type & BTRFS_BLOCK_GROUP_RAID10)
		num = map->sub_stripes;

>@@ -5804,10 +5914,12 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
> 			stripe_index += mirror_num - 1;
> 		else {
> 			int old_stripe_index = stripe_index;
>+			optimal = guess_optimal(map,
>+					current->pid % map->num_stripes);
> 			stripe_index = find_live_mirror(fs_info, map,
> 					stripe_index,
> 					map->sub_stripes, stripe_index +
>-					current->pid % map->sub_stripes,
>+					optimal,
> 					dev_replace_is_ongoing);
> 			mirror_num = stripe_index - old_stripe_index + 1;
> 		}
>--
>2.15.1

Also here calculation should be with map->sub_stripes too.
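To see why the modulus matters, here is a small arithmetic illustration. It assumes an example RAID10 layout of 4 stripes arranged as mirror pairs (sub_stripes = 2); mirror_offset is a made-up helper, and this is not the kernel code, only the pid arithmetic from the hunk above.

```shell
#!/bin/sh
# Sketch only: the layout (4 stripes in mirror pairs of 2) is an assumed
# example, and mirror_offset is an invented helper name.
mirror_offset() {
    # $1: pid, $2: modulus (num_stripes or sub_stripes)
    echo $(( $1 % $2 ))
}

num_stripes=4
sub_stripes=2
for pid in 100 101 102 103; do
    a=$(mirror_offset "$pid" "$num_stripes")
    b=$(mirror_offset "$pid" "$sub_stripes")
    echo "pid=$pid: % num_stripes -> $a, % sub_stripes -> $b"
done
```

With num_stripes as the modulus the result can be 2 or 3, which lies outside a mirror pair of two; with sub_stripes it stays within 0 and 1.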
Re: [PATCH v2] Btrfs: enchanse raid1/10 balance heuristic
On Fri, 29 Dec 2017 05:09:14 +0300
Timofey Titovets wrote:

> Currently btrfs raid1/10 balancer balance requests to mirrors,
> based on pid % num of mirrors.
>
> Make logic understood:
>  - if one of underline devices are non rotational
>  - Queue leght to underline devices
>
> By default try use pid % num_mirrors guessing, but:
>  - If one of mirrors are non rotational, repick optimal to it
>  - If underline mirror have less queue leght then optimal,
>    repick to that mirror
>
> For avoid round-robin request balancing,
> lets round down queue leght:
>  - By 8 for rotational devs
>  - By 2 for all non rotational devs
>
> Changes:
>   v1 -> v2:
>     - Use helper part_in_flight() from genhd.c
>       to get queue lenght
>     - Move guess code to guess_optimal()
>     - Change balancer logic, try use pid % mirror by default
>       Make balancing on spinning rust if one of underline devices
>       are overloaded
>
> Signed-off-by: Timofey Titovets
> ---
>  block/genhd.c      |   1 +
>  fs/btrfs/volumes.c | 116
>  2 files changed, 115 insertions(+), 2 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 96a66f671720..a77426a7 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -81,6 +81,7 @@ void part_in_flight(struct request_queue *q, struct hd_struct *part,
>  				atomic_read(&part->in_flight[1]);
>  	}
>  }
> +EXPORT_SYMBOL_GPL(part_in_flight);
>
>  struct hd_struct *__disk_get_part(struct gendisk *disk, int partno)
>  {
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 9a04245003ab..1c84534df9a5 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -27,6 +27,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include "ctree.h"
>  #include "extent_map.h"
> @@ -5216,6 +5217,112 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
>  	return ret;
>  }
>
> +/**
> + * bdev_get_queue_len - return rounded down in flight queue lenght of bdev
> + *
> + * @bdev: target bdev
> + * @round_down: round factor big for hdd and small for ssd, like 8 and 2
> + */
> +static int bdev_get_queue_len(struct block_device *bdev, int round_down)
> +{
> +	int sum;
> +	struct hd_struct *bd_part = bdev->bd_part;
> +	struct request_queue *rq = bdev_get_queue(bdev);
> +	uint32_t inflight[2] = {0, 0};
> +
> +	part_in_flight(rq, bd_part, inflight);
> +
> +	sum = max_t(uint32_t, inflight[0], inflight[1]);
> +
> +	/*
> +	 * Try prevent switch for every sneeze
> +	 * By roundup output num by some value
> +	 */
> +	return ALIGN_DOWN(sum, round_down);
> +}
> +
> +/**
> + * guess_optimal - return guessed optimal mirror
> + *
> + * Optimal expected to be pid % num_stripes
> + *
> + * That's generaly ok for spread load
> + * Add some balancer based on queue leght to device
> + *
> + * Basic ideas:
> + *  - Sequential read generate low amount of request
> + *    so if load of drives are equal, use pid % num_stripes balancing
> + *  - For mixed rotate/non-rotate mirrors, pick non-rotate as optimal
> + *    and repick if other dev have "significant" less queue lenght
> + *  - Repick optimal if queue leght of other mirror are less
> + */
> +static int guess_optimal(struct map_lookup *map, int optimal)
> +{
> +	int i;
> +	int round_down = 8;
> +	int num = map->num_stripes;

num has to be initialized from map->sub_stripes if we're reading RAID10,
otherwise there will be NULL pointer dereference

> +	int qlen[num];
> +	bool is_nonrot[num];
> +	bool all_bdev_nonrot = true;
> +	bool all_bdev_rotate = true;
> +	struct block_device *bdev;
> +
> +	if (num == 1)
> +		return optimal;
> +
> +	/* Check accessible bdevs */
> +	for (i = 0; i < num; i++) {
> +		/* Init for missing bdevs */
> +		is_nonrot[i] = false;
> +		qlen[i] = INT_MAX;
> +		bdev = map->stripes[i].dev->bdev;
> +		if (bdev) {
> +			qlen[i] = 0;
> +			is_nonrot[i] = blk_queue_nonrot(bdev_get_queue(bdev));
> +			if (is_nonrot[i])
> +				all_bdev_rotate = false;
> +			else
> +				all_bdev_nonrot = false;
> +		}
> +	}
> +
> +	/*
> +	 * Don't bother with computation
> +	 * if only one of two bdevs are accessible
> +	 */
> +	if (num == 2 && qlen[0] != qlen[1]) {
> +		if (qlen[0] < qlen[1])
> +			return 0;
> +		else
> +			return 1;
> +	}
> +
> +	if (all_bdev_nonrot)
> +		round_down = 2;
> +
> +	for (i = 0; i < num; i++) {
> +		if (qlen[i])
> +			continue;
> +		bdev = map->stripes[i].dev->bdev;
> +		qlen[i] = bdev_get_queue_len(bdev, round_down);
> +	}
> +
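The effect of rounding the queue length down before comparing can be tried out with a toy model. This is a simplified two-mirror sketch of the idea in the patch description (round by 8 for rotational, by 2 for non-rotational, repick only if the other mirror's rounded queue is shorter); it is not the patch's code, and round_qlen and pick_mirror are made-up names.

```shell
#!/bin/sh
# Sketch only: a simplified two-mirror model of the idea in the patch
# description, not the patch's code. Function names are invented.
round_qlen() {
    # $1: in-flight requests, $2: rounding step (8 for hdd, 2 for ssd)
    echo $(( $1 / $2 * $2 ))
}

pick_mirror() {
    # $1: default mirror index (0 or 1), as pid % 2 would give
    # $2: queue length of mirror 0, $3: queue length of mirror 1
    # $4: rounding step
    q0=$(round_qlen "$2" "$4")
    q1=$(round_qlen "$3" "$4")
    if [ "$1" -eq 0 ] && [ "$q1" -lt "$q0" ]; then
        echo 1
    elif [ "$1" -eq 1 ] && [ "$q0" -lt "$q1" ]; then
        echo 0
    else
        echo "$1"
    fi
}

pick_mirror 0 7 3 8    # both round to 0: stay on mirror 0
pick_mirror 0 17 3 8   # 16 vs 0: repick mirror 1
pick_mirror 0 7 3 2    # 6 vs 2: repick mirror 1
```

With a step of 8, small differences in queue length do not cause a switch, which is the "don't switch for every sneeze" behaviour the comment in bdev_get_queue_len() describes.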
Re: WARNING: CPU: 1 PID: 3016 at fs/btrfs/ctree.h:1564 btrfs_update_device+0x189/0x190 [btrfs]
Thanks,

* Nikolay Borisov [2017-12-29 19:23 +0200]:

[...]

> So OP:
>
> Update your btrfs-progs package to latest 4.14 and run btrfs rescue :
>
> btrfs rescue fix-device-size

I installed btrfs-progs 4.14. Can't run
'btrfs rescue fix-device-size /dev/sd(a|b)3'. The devices are mounted
including my root...

How to accomplish?

Thanks in advance

Elimar
--
  Excellent day for drinking heavily.
  Spike the office water cooler;-)
Re: WARNING: CPU: 1 PID: 3016 at fs/btrfs/ctree.h:1564 btrfs_update_device+0x189/0x190 [btrfs]
On 29.12.2017 19:07, Holger Hoffstätte wrote:
>
> Apply the patch from https://patchwork.kernel.org/patch/9960893/
> and follow the logged instructions re. device resizing (or see
> https://bugzilla.kernel.org/show_bug.cgi?id=196949 for examples).
>
> The patch is unfortunately not yet merged into 4.15rc, otherwise it
> could be sent to 4.14-stable.
>

This is not the correct way to resolve the issue. Rather, Qu has sent a patch for btrfs-progs which does the correct repair. The code in question is in btrfs-progs 4.14.

So OP:

Update your btrfs-progs package to latest 4.14 and run btrfs rescue :

btrfs rescue fix-device-size

> -h
>
Re: WARNING: CPU: 1 PID: 3016 at fs/btrfs/ctree.h:1564 btrfs_update_device+0x189/0x190 [btrfs]
Apply the patch from https://patchwork.kernel.org/patch/9960893/
and follow the logged instructions re. device resizing (or see
https://bugzilla.kernel.org/show_bug.cgi?id=196949 for examples).

The patch is unfortunately not yet merged into 4.15rc, otherwise it
could be sent to 4.14-stable.

-h
Re: WARNING: CPU: 1 PID: 3016 at fs/btrfs/ctree.h:1564 btrfs_update_device+0x189/0x190 [btrfs]
* Elimar Riesebieter[2017-12-29 17:33 +0100]: > Hi all, > > I get warnings as seen in attached dmesg.log. This is on 4.14.9. > 4.9.72 runs flawless so far. > > ## > Linux toy 4.14.9-toy-lxtec-amd64 #7 SMP Fri Dec 29 10:43:28 CET 2017 x86_64 > GNU/Linux > -- > btrfs-progs v4.13.3 > -- > Label: 'TOY-RAID1' uuid: 32fc4ea0-0b26-478c-9b2e-b299d6289270 > Total devices 2 FS bytes used 560.98GiB > devid1 size 3.62TiB used 566.03GiB path /dev/sda3 > devid2 size 3.62TiB used 566.03GiB path /dev/sdb3 > -- > Data, RAID1: total=563.00GiB, used=559.43GiB > System, RAID1: total=32.00MiB, used=96.00KiB > Metadata, RAID1: total=3.00GiB, used=1.55GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > # > > I want to run this machine as a 24/7 server with the latest > LT-KERNEL. So please investigate. Ooops, dmesg attached now. Many thanks Elimar -- Alles, was viel bedacht wird, wird bedenklich!;-) Friedrich Nietzsche [0.00] Linux version 4.14.9-toy-lxtec-amd64 (er@toy) (gcc version 7.2.0 (Debian 7.2.0-18)) #7 SMP Fri Dec 29 10:43:28 CET 2017 [0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.14.9-toy-lxtec-amd64 root=UUID=32fc4ea0-0b26-478c-9b2e-b299d6289270 ro [0.00] KERNEL supported cpus: [0.00] Intel GenuineIntel [0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [0.00] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers' [0.00] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR' [0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [0.00] x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64 [0.00] x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64 [0.00] x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format. 
[0.00] e820: BIOS-provided physical RAM map: [0.00] BIOS-e820: [mem 0x-0x00057fff] usable [0.00] BIOS-e820: [mem 0x00058000-0x00058fff] reserved [0.00] BIOS-e820: [mem 0x00059000-0x0009efff] usable [0.00] BIOS-e820: [mem 0x0009f000-0x0009] reserved [0.00] BIOS-e820: [mem 0x0010-0x77e90fff] usable [0.00] BIOS-e820: [mem 0x77e91000-0x77e91fff] ACPI NVS [0.00] BIOS-e820: [mem 0x77e92000-0x77edbfff] reserved [0.00] BIOS-e820: [mem 0x77edc000-0x7d09] usable [0.00] BIOS-e820: [mem 0x7d0a-0x7d42dfff] reserved [0.00] BIOS-e820: [mem 0x7d42e000-0x7d5ecfff] usable [0.00] BIOS-e820: [mem 0x7d5ed000-0x7dd90fff] ACPI NVS [0.00] BIOS-e820: [mem 0x7dd91000-0x7fe67fff] reserved [0.00] BIOS-e820: [mem 0x7fe68000-0x7fffefff] type 20 [0.00] BIOS-e820: [mem 0x7000-0x7fff] usable [0.00] BIOS-e820: [mem 0xe000-0xefff] reserved [0.00] BIOS-e820: [mem 0xfe00-0xfe010fff] reserved [0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved [0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved [0.00] BIOS-e820: [mem 0xff00-0x] reserved [0.00] BIOS-e820: [mem 0x0001-0x000877bf] usable [0.00] NX (Execute Disable) protection: active [0.00] efi: EFI v2.40 by American Megatrends [0.00] efi: ESRT=0x7fc2e918 ACPI=0x7d7f5000 ACPI 2.0=0x7d7f5000 SMBIOS=0xf05e0 SMBIOS 3.0=0x7fb79000 MPS=0xfca00 [0.00] random: fast init done [0.00] SMBIOS 3.0.0 present. 
[0.00] DMI: Supermicro Super Server/X11SSM-F, BIOS 2.0b 07/28/2017 [0.00] e820: update [mem 0x-0x0fff] usable ==> reserved [0.00] e820: remove [mem 0x000a-0x000f] usable [0.00] e820: last_pfn = 0x877c00 max_arch_pfn = 0x4 [0.00] MTRR default type: write-back [0.00] MTRR fixed ranges enabled: [0.00] 0-9 write-back [0.00] A-B uncachable [0.00] C-F write-protect [0.00] MTRR variable ranges enabled: [0.00] 0 base 00C000 mask 7FC000 uncachable [0.00] 1 base 00A000 mask 7FE000 uncachable [0.00] 2 base 009000 mask 7FF000 uncachable [0.00] 3 base 008C00 mask 7FFC00 uncachable [0.00] 4 base 008A00 mask 7FFE00 uncachable [0.00] 5 base 008900 mask 7FFF00 uncachable [0.00] 6 base 008880 mask 7FFF80 uncachable [0.00] 7 base 008840 mask 7FFFC0 uncachable [
WARNING: CPU: 1 PID: 3016 at fs/btrfs/ctree.h:1564 btrfs_update_device+0x189/0x190 [btrfs]
Hi all,

I get warnings as seen in attached dmesg.log. This is on 4.14.9.
4.9.72 runs flawless so far.

##
Linux toy 4.14.9-toy-lxtec-amd64 #7 SMP Fri Dec 29 10:43:28 CET 2017 x86_64 GNU/Linux
--
btrfs-progs v4.13.3
--
Label: 'TOY-RAID1'  uuid: 32fc4ea0-0b26-478c-9b2e-b299d6289270
	Total devices 2 FS bytes used 560.98GiB
	devid 1 size 3.62TiB used 566.03GiB path /dev/sda3
	devid 2 size 3.62TiB used 566.03GiB path /dev/sdb3
--
Data, RAID1: total=563.00GiB, used=559.43GiB
System, RAID1: total=32.00MiB, used=96.00KiB
Metadata, RAID1: total=3.00GiB, used=1.55GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
#

I want to run this machine as a 24/7 server with the latest
LT-KERNEL. So please investigate.

Many thanks in advance

Elimar
--
  355/113: Not the famous irrational number pi,
  but an incredible simulation!
  -unknown
Re: [PATCH v2 02/17] btrfs-progs: lowmem check: record returned errors after walk_down_tree_v2()
On 20.12.2017 06:57, Su Yue wrote:
> In lowmem mode with '--repair', check_chunks_and_extents_v2()
> will fix accounting in block groups and clear the error
> bit BG_ACCOUNTING_ERROR.
> However, return value of check_btrfs_root() is 0 either 1 instead of
> error bits.
>
> If extent tree is on error, lowmem repair always prints error and
> returns nonzero value even the filesystem is fine after repair.
>
> So let @err contains bits after walk_down_tree_v2().
>
> Introduce FATAL_ERROR for lowmem mode to represents negative return
> values since negative and positive can't not be mixed in bits operations.
>
> Signed-off-by: Su Yue
> ---
>  cmds-check.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/cmds-check.c b/cmds-check.c
> index 309ac9553b3a..ebede26cef01 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -134,6 +134,7 @@ struct data_backref {
>  #define DIR_INDEX_MISMATCH  (1<<19) /* INODE_INDEX found but not match */
>  #define DIR_COUNT_AGAIN     (1<<20) /* DIR isize should be recalculated */
>  #define BG_ACCOUNTING_ERROR (1<<21) /* Block group accounting error */
> +#define FATAL_ERROR         (1<<22) /* fatal bit for errno */
>
>  static inline struct data_backref* to_data_backref(struct extent_backref *back)
>  {
> @@ -6556,7 +6557,7 @@ static struct data_backref *find_data_backref(struct extent_record *rec,
>   *           otherwise means check fs tree(s) items relationship and
>   *           @root MUST be a fs tree root.
>   * Returns 0 represents OK.
> - * Returns not 0 represents error.
> + * Returns > 0represents error bits.
>   */

What about the code in the 'if (!check_all)' branch? check_fs_first_inode can return a negative value, hence check_btrfs_root can return a negative value. A negative value can also be returned from btrfs_search_slot. Clearly this patch needs to be thought out better.

>  static int check_btrfs_root(struct btrfs_trans_handle *trans,
>  			    struct btrfs_root *root, unsigned int ext_ref,
> @@ -6607,12 +6608,12 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
>  	while (1) {
>  		ret = walk_down_tree_v2(trans, root, , , ,
>  					ext_ref, check_all);
> -
> -		err |= !!ret;
> +		if (ret > 0)
> +			err |= ret;
>
>  		/* if ret is negative, walk shall stop */
>  		if (ret < 0) {
> -			ret = err;
> +			ret = err | FATAL_ERROR;
>  			break;
>  		}
>
> @@ -6636,12 +6637,12 @@ out:
>   * @ext_ref: the EXTENDED_IREF feature
>   *
>   * Return 0 if no error found.
> - * Return <0 for error.
> + * Return not 0 for error.
>   */
>  static int check_fs_root_v2(struct btrfs_root *root, unsigned int ext_ref)
>  {
>  	reset_cached_block_groups(root->fs_info);
> -	return check_btrfs_root(NULL, root, ext_ref, 0);
> +	return !!check_btrfs_root(NULL, root, ext_ref, 0);
>  }

You make the function effectively boolean; make this explicit by changing its return type to bool. Also, the name combined with the boolean return makes the function REALLY confusing, i.e. when should we return true or false? As it stands it returns "false" on success and "true" otherwise; this is a mess...

>
>  /*
>
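The return-value convention being discussed (positive values are error bits, a negative errno is folded into a FATAL_ERROR bit) can be modelled with shell arithmetic. This is only a toy model of the hunk above, not btrfs-progs code; accumulate is a made-up name and -5 stands in for an arbitrary negative errno.

```shell
#!/bin/sh
# Sketch only: models the return-value convention under discussion with
# shell arithmetic; accumulate is an invented helper name.
BG_ACCOUNTING_ERROR=$((1 << 21))
FATAL_ERROR=$((1 << 22))

accumulate() {
    # $1: error bits collected so far, $2: return value of one walk step
    # Positive values are error bits; a negative value (an errno) cannot be
    # OR-ed in safely, so it is represented by the FATAL_ERROR bit instead.
    if [ "$2" -lt 0 ]; then
        echo $(( $1 | FATAL_ERROR ))
    else
        echo $(( $1 | $2 ))
    fi
}

err=0
err=$(accumulate "$err" "$BG_ACCOUNTING_ERROR")
err=$(accumulate "$err" -5)
echo "err=$err"
if [ $(( err & FATAL_ERROR )) -ne 0 ]; then
    echo "walk aborted on a negative return"
fi
```

The model only covers the walk_down_tree_v2() loop; the review's point is that other paths in check_btrfs_root() can still hand back a raw negative value that bypasses this conversion.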
Re: [PATCH 4/7] blk-mq: make blk_abort_request() trigger timeout path
On Sat, Dec 16, 2017 at 04:07:23AM -0800, Tejun Heo wrote:
> Note that this makes blk_abort_request() asynchronous - it initiates
> abortion but the actual termination will happen after a short while,
> even when the caller owns the request. AFAICS, SCSI and ATA should be
> fine with that and I think mtip32xx and dasd should be safe but not
> completely sure. It'd be great if people who know the drivers take a
> look.

For that you'll need to CC linux-ide and linux-scsi, and for the SAS drivers some of the usual suspects that touch the SAS code.
Re: [PATCH 1/7] blk-mq: protect completion path with RCU
Why do you need the srcu protection? The completion path can never sleep.

If there is a good reason to keep it, please add a comment, and make the srcu variant a separate function only used by drivers that need it instead of adding the conditional.
Re: [PATCHSET v3] blk-mq: reimplement timeout handling
This seems to miss the linux-block list once again. Please include it
in the next resend.
[PATCH] btrfs-progs: Print error on invalid extent item format during check
During a normal-mode check, if the code comes across an invalid extent
item format it just hits BUG() and exits without printing any
information useful for debugging. Improve the situation by printing the
key, leaf bytenr and slot, which makes it possible to quickly inspect
the tree and see what the corruption is.

Signed-off-by: Nikolay Borisov
---
 cmds-check.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index a93ac2c88a38..371516709ed8 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -8362,7 +8362,12 @@ static int process_extent_item(struct btrfs_root *root,
 	if (item_size < sizeof(*ei)) {
 #ifdef BTRFS_COMPAT_EXTENT_TREE_V0
 		struct btrfs_extent_item_v0 *ei0;
-		BUG_ON(item_size != sizeof(*ei0));
+		if (item_size != sizeof(*ei0)) {
+			error("invalid extent item format: ITEM[%llu %u %llu] leaf: %llu slot: %d",
+			      key.objectid, key.type, key.offset,
+			      btrfs_header_bytenr(eb), slot);
+			BUG();
+		}
 		ei0 = btrfs_item_ptr(eb, slot, struct btrfs_extent_item_v0);
 		refs = btrfs_extent_refs_v0(eb, ei0);
 #else
--
2.7.4
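For illustration, a standalone sketch of the message the new error() call
would produce, using the same format string as the patch. The struct and
function names (sketch_key, sketch_format_bad_extent) are hypothetical
stand-ins and not btrfs-progs API; the values in the usage note are made up.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the three fields of a btrfs key. */
struct sketch_key {
	unsigned long long objectid;
	unsigned int type;
	unsigned long long offset;
};

/*
 * Format the diagnostic into buf with the same format string as the
 * patch. Returns what snprintf() returns: the length the full message
 * needs, excluding the terminating NUL.
 */
static int sketch_format_bad_extent(char *buf, size_t len,
				    const struct sketch_key *key,
				    unsigned long long leaf_bytenr, int slot)
{
	return snprintf(buf, len,
			"invalid extent item format: ITEM[%llu %u %llu] leaf: %llu slot: %d",
			key->objectid, key->type, key->offset,
			leaf_bytenr, slot);
}
```

For a made-up key (12345 168 4096) found in slot 3 of a leaf at bytenr
29360128, the message reads "invalid extent item format: ITEM[12345 168 4096]
leaf: 29360128 slot: 3". The leaf bytenr is the piece that lets you go
straight to the damaged block, for example with
"btrfs inspect-internal dump-tree -b 29360128 <device>".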
Re: [RFC PATCH bpf-next v2 1/4] tracing/kprobe: bpf: Check error injectable event is on function entry
On Thu, 28 Dec 2017 17:03:24 -0800
Alexei Starovoitov wrote:

> On 12/28/17 12:20 AM, Masami Hiramatsu wrote:
> > On Wed, 27 Dec 2017 20:32:07 -0800
> > Alexei Starovoitov wrote:
> >
> >> On 12/27/17 8:16 PM, Steven Rostedt wrote:
> >>> On Wed, 27 Dec 2017 19:45:42 -0800
> >>> Alexei Starovoitov wrote:
> >>>
> >>>> I don't think that's the case. My reading of current
> >>>> trace_kprobe_ftrace() -> arch_check_ftrace_location()
> >>>> is that it will not be true for old mcount case.
> >>>
> >>> In the old mcount case, you can't use ftrace to return without calling
> >>> the function. That is, no modification of the return ip, unless you
> >>> created a trampoline that could handle arbitrary stack frames, and
> >>> remove them from the stack before returning back to the function.
> >>
> >> correct. I was saying that trace_kprobe_ftrace() won't let us do
> >> bpf_override_return with old mcount.
> >
> > No, trace_kprobe_ftrace() just checks the given address will be
> > managed by ftrace. you can see arch_check_ftrace_location() in
> > kernel/kprobes.c.
> >
> > FYI, CONFIG_KPROBES_ON_FTRACE depends on DYNAMIC_FTRACE_WITH_REGS, and
> > DYNAMIC_FTRACE_WITH_REGS doesn't depend on CC_USING_FENTRY.
> > This means if you compile kernel with old gcc and enable DYNAMIC_FTRACE,
> > kprobes uses ftrace on mcount address which is NOT the entry point
> > of target function.
>
> ok. fair enough. I think we can gate the feature to !mcount only.
>
> > On the other hand, changing IP feature has been implemented originally
> > by kprobes with int3 (sw breakpoint). This means you can use kprobes
> > at correct address (the entry address of the function) you can hijack
> > the function, as jprobe did.
> >
> >>>> As far as the rest of your arguments it very much puzzles me that
> >>>> you claim that this patch suppose to work based on historical
> >>>> reasoning whereas you did NOT test it.
> >>>
> >>> I believe that Masami is saying that the modification of the IP from
> >>> kprobes has been very well tested. But I'm guessing that you still want
> >>> a test case for using kprobes in this particular instance. It's not the
> >>> implementation of modifying the IP that you are worried about, but the
> >>> implementation of BPF using it in this case. Right?
> >>
> >> exactly. No doubt that old code works.
> >> But it doesn't mean that bpf_override_return() will continue to
> >> work in kprobes that are not ftrace based.
> >> I suspect Josef's existing test case will cover this situation.
> >> Probably only special .config is needed to disable ftrace, so
> >> "kprobe on entry but not ftrace" check will kick in.
> >
> > Right. If you need to test it, you can run Josef's test case without
> > CONFIG_DYNAMIC_FTRACE.
>
> It should be obvious that the person who submits the patch
> must run the tests.
>
> >> But I didn't get an impression that this situation was tested.
> >> Instead I see only logical reasoning that it's _supposed_ to work.
> >> That's not enough.
> >
> > OK, so would you just ask me to run samples/bpf ?
>
> Please run Josef's test in the !ftrace setup.

Yes, I'll add the result of the test case.

Thank you,

--
Masami Hiramatsu