Re: Slow dc3dd in 3.18 on x86
On 10/24/14 21:51, Martin K. Petersen wrote:
> "Michael" == Michael L Semon writes:
>
> Michael> There was nothing regarding integrity in /sys/block/sda. I was
> Michael> under the impression that both bio integrity and T10 checksums
> Michael> require hardware support from good hardware, so the config
> Michael> items have always been shut off.
>
> That's correct. But I see what's going on. Please try this:
>
> commit 8d331952d2cd341d5c0e64eee961f78f6eb4b968
> Author: Martin K. Petersen
> Date:   Fri Oct 24 21:39:12 2014 -0400
>
>     block: Fix merge logic when CONFIG_BLK_DEV_INTEGRITY is not defined
>
>     Commit 4eaf99beadce switched to returning bool and as a result reversed
>     the logic of the integrity merge checks. However, the empty stubs used
>     when the block integrity code is compiled out were still returning 0.
>
>     Make these stubs return "true".
>
>     Reported-by: Michael L. Semon
>     Signed-off-by: Martin K. Petersen
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 9fbf4d3196ed..7442c6b9187e 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1590,13 +1590,13 @@ static inline bool blk_integrity_merge_rq(struct request_queue *rq,
>  					  struct request *r1,
>  					  struct request *r2)
>  {
> -	return 0;
> +	return true;
>  }
>  static inline bool blk_integrity_merge_bio(struct request_queue *rq,
>  					   struct request *r,
>  					   struct bio *b)
>  {
> -	return 0;
> +	return true;
>  }
>  static inline bool blk_integrity_is_initialized(struct gendisk *g)
>  {

Excellent! All is well with the problem kernel. All is well with the night's git master + xfs-oss/for-next as well.

The saved E-mail patched cleanly against both kernels using `git am`. About 40 GB of data was written, some using fio, most using dc3dd. No problems.

I'll keep this patch and stand behind its good results so far.

Thanks again!

Michael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
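
A rough way to see why the stub's return value matters is the toy model below. It is not kernel code: `count_requests` and the two stub functions are hypothetical stand-ins, and the model assumes (as the commit message implies) that callers now treat a false return as "do not merge". With the integrity code compiled out, a stub that still returns 0 would then refuse every merge, so a sequential write degenerates into many tiny requests.

```python
# Toy model (not kernel code) of the merge check described in the patch.
# All names below are hypothetical stand-ins for the C functions.

def integrity_stub_old():
    """Compiled-out stub before the fix: still returned 0."""
    return 0


def integrity_stub_fixed():
    """Compiled-out stub after the fix: returns true."""
    return True


def count_requests(num_bios, integrity_merge_ok):
    """Count requests dispatched for a run of contiguous bios.

    A bio is merged into the current request only when the integrity
    check reports that merging is allowed (a truthy return value).
    """
    requests = 0
    for i in range(num_bios):
        if i > 0 and integrity_merge_ok():
            continue  # merged into the previous request
        requests += 1
    return requests


if __name__ == "__main__":
    print(count_requests(128, integrity_stub_old))    # 128: nothing merges
    print(count_requests(128, integrity_stub_fixed))  # 1: everything merges
```

On a drive with its write cache disabled, each unmerged request would pay its own round trip, which is at least consistent with the slowdown reported in this thread.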
Re: Slow dc3dd in 3.18 on x86
On 10/24/14 15:54, Martin K. Petersen wrote:
> "Michael" == Michael L Semon writes:
>
> Michael> This week, a simple `dc3dd wipe=/dev/sda5` operation had speeds
> Michael> cut from 10-15 MB/s down to less than 1.8 MB/s. With this
> Michael> method, syncs took so long that magic SysRq keys were needed to
> Michael> stop the PC. A bisect led me here:
>
> That commit itself doesn't do anything.
>
> I was concerned that somehow integrity got enabled by accident and you
> were bogged down by checksum calculations but I see no evidence that
> this would be happening. Ran a few tests on ATA systems here. So I'm
> puzzled.
>
> Please let me know if the disk has an integrity profile in sysfs.

It appears that I botched a `git bisect replay` when I was close to the end. I'm several commits off, and the actual commit seems to be this one, using the `git branch ...; git checkout ...` method:

commit 4eaf99beadcefbf126fa05e66fb40fca999e09fd
Author: Martin K. Petersen
Date:   Fri Sep 26 19:20:06 2014 -0400

    block: Don't merge requests if integrity flags differ

    We'd occasionally merge requests with conflicting integrity flags.
    Introduce a merge helper which checks that the requests have compatible
    integrity payloads.

    Signed-off-by: Martin K. Petersen
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Jens Axboe

There was nothing regarding integrity in /sys/block/sda. I was under the impression that both bio integrity and T10 checksums require hardware support from good hardware, so the config items have always been shut off. If these are really generic interfaces for all scsi/libata hardware, let me know! I'd love to try the new facility.

The drive in question is this (from smartctl):

Model Family:     Western Digital Caviar
Device Model:     WDC WD600BB-75CAA0
Serial Number:    WD-WMA8F2149190
Firmware Version: 16.06V16
User Capacity:    60,022,480,896 bytes [60.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   5
ATA Standard is:  Exact ATA specification draft version not indicated

This test machine is in its infancy, so the dwarf tools and perf aren't loaded yet. The closest I can get to error messages is to re-enact the dc3dd case:

root@kyhorse:~# dc3dd wipe=/dev/sda5

dc3dd 7.1.614 started at 2014-10-24 18:35:08 -0400
compiled options:
command line: dc3dd wipe=/dev/sda5
device size: 3145728 sectors (probed)
sector size: 512 bytes (probed)
20709376 bytes (20 M) copied ( 1%), 0.101421 s, 195 M/s

# starts off nicely, but give it a minute or two, and...

177569792 bytes (169 M) copied (11%), 88.0277 s, 1.9 M/s
^C^C^C

# `killall dc3dd` did not work even when adding -4, -6, and -9; so use
# the magic SysRq keys at the keyboard...

[ 451.335299] SysRq : Terminate All Tasks
[ 452.160255] SysRq : Kill All Tasks
[ 453.015607] SysRq : Terminate All Tasks
[ 453.443829] SysRq : Kill All Tasks
[ 453.949758] SysRq : Emergency Sync

Welcome to Linux 3.17.0-rc5+ (ttyS0)

kyhorse login: [ 458.109242] SysRq : Emergency Remount R/O
[ 461.656730] SysRq : Emergency Sync
[ 468.111580] SysRq : Terminate All Tasks
[ 468.818737] SysRq : Kill All Tasks
[ 469.219577] SysRq : Terminate All Tasks
[ 469.568710] SysRq : Kill All Tasks

Welcome to Linux 3.17.0-rc5+ (ttyS0)

kyhorse login: root
Password:
Last login: Fri Oct 24 18:34:45 -0400 2014 on /dev/ttyS0.
root@kyhorse:~# shutdown -h now

Broadcast message from root@kyhorse (ttyS0) (Fri Oct 24 18:37:42 2014):

The system is going down for system halt NOW!

# it stalled for several minutes, so use the SysRq keys again...

[ 505.694083] SysRq : DEBUG

Entering kdb (current=0xc159ee40, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> ps
35 sleeping system daemon (state M) processes suppressed,
use 'ps A' to see all.
Task Addr   Pid  Parent [*] cpu State Thread     Command
0xc159ee40    0       0   1   0   R   0xc159f0f0 *swapper/0
0xef050000    1       0   0   0   S   0xef0502b0  init
0xef051590    6       2   0   0   D   0xef051840  kworker/u2:0
0xef101590   20       2   0   0   D   0xef101840  kworker/0:1
0xee506f70  109       1   0   0   D   0xee507220  udevd
0xeebdccf0  357       1   0   0   Z   0xeebdcfa0  dc3dd
0xeebde6d0  359       1   0   0   D   0xeebde980  dc3dd
0xeebdeb20  362       2   0   0   D   0xeebdedd0  kworker/0:2
0xee584450  392       2   0   0   D   0xee584700  kworker/0:3
0xeebdc8a0  415       1   0   0   S   0xeebdcb50  agetty
0xeebdd140  416       1   0   0   S   0xeebdd3f0  agetty
0xeebdf3c0  417       1   0   0   S   0xeebdf670  bash
0xeebdd590  418       1   0   0   S   0xeebdd840  agetty
0xeebdef70  419       1   0   0   S   0
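
The two dc3dd progress lines quoted in this message can be cross-checked with a few lines of Python. This is only a sketch: the regular expression is written against the two lines shown above and may not cover every dc3dd output variant, and the helper names are made up for illustration. The figures suggest that dc3dd's "M" means MiB (2^20 bytes).

```python
import re

# Matches dc3dd progress lines such as:
#   177569792 bytes (169 M) copied (11%), 88.0277 s, 1.9 M/s
PROGRESS = re.compile(
    r"(\d+) bytes \([^)]*\) copied \(\s*(\d+)%\),\s*([\d.]+) s"
)


def parse_progress(line):
    """Return (bytes_copied, percent_done, elapsed_seconds)."""
    m = PROGRESS.search(line)
    if m is None:
        raise ValueError("not a dc3dd progress line: %r" % line)
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def rate_mib_per_s(line):
    """Average throughput since the start of the run, in MiB/s."""
    nbytes, _percent, seconds = parse_progress(line)
    return nbytes / seconds / 2**20


if __name__ == "__main__":
    fast = "20709376 bytes (20 M) copied ( 1%), 0.101421 s, 195 M/s"
    slow = "177569792 bytes (169 M) copied (11%), 88.0277 s, 1.9 M/s"
    for line in (fast, slow):
        print("%.1f MiB/s" % rate_mib_per_s(line))
    # Share of the 3145728-sector, 512-byte-sector partition written so far.
    print("%.1f%%" % (100 * 177569792 / (3145728 * 512)))
```

Note that the 1.9 M/s figure is an average over the whole run, including the fast first moments, so the sustained rate once the slowdown sets in is lower still.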
Slow dc3dd in 3.18 on x86
Hi!

I have an old i686 Pentium 4 that I use for xfstests. To better keep integrity, write cache is disabled on an old 60-"gigabyte" IDE HDD. The PC runs slackware-current, doing a `git pull` of the kernel and xfs-oss/for-next once or twice a week.

This week, a simple `dc3dd wipe=/dev/sda5` operation had speeds cut from 10-15 MB/s down to less than 1.8 MB/s. With this method, syncs took so long that magic SysRq keys were needed to stop the PC. A bisect led me here:

root@kyhorse:/usr/src/kernel-git/linux# git bisect good
aae7df50190a640e51bfe11c93f94741ac82ff0b is the first bad commit
commit aae7df50190a640e51bfe11c93f94741ac82ff0b
Author: Martin K. Petersen
Date:   Fri Sep 26 19:20:05 2014 -0400

    block: Integrity checksum flag

    Make the choice of checksum a per-I/O property by introducing a flag
    that can be inspected by the SCSI layer.

    There are several reasons for this:

     1. It allows us to switch choice of checksum without unloading and
        reloading the HBA driver.

     2. During error recovery we need to be able to tell the HBA that
        checksums read from disk should not be verified and converted to
        IP checksums.

     3. For error injection purposes we need to be able to write a bad
        guard tag to storage. Since the storage device only supports T10
        CRC we need to be able to disable IP checksum conversion on the
        HBA.

    Signed-off-by: Martin K. Petersen
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Jens Axboe

:040000 040000 9a0fd5dc52f1384280e8cfea63fef7951db9a4d2 2d6ce5012ce8264b82772910060cc97001a30a80 M	block
:040000 040000 2ef62fa822934877285dd0ea6ed4bc154b3fb4e4 e9935ccb2fe0fe62bc0d925c7c4eb3291f227b42 M	drivers
:040000 040000 f4127155cd44a1ad376b1e193263a8eeb6aa267d b158970e89b03396c436dc471d83e8d4c3f96969 M	include

Uh, OK, but I don't use the integrity features at all. When configuring the kernel, the "Enable the block layer" section has only CONFIG_LBDAF=y selected in that first-level menu. The T10 items aren't configured elsewhere (Cryptographic API, etc.).

Do I need to be setting some config items to "y" to get this boat anchor to zero partitions at its usual slow rate again?

Should you find such an issue and need a relatively safe test, you can use this fio job file:

# start of job file:
[global]
filename=/dev/sda5
fill_device=1
bs=64k
numjobs=1
zero_buffers=1

[write]
rw=write
write_bw_log=write
write_iops_log=write
write_lat_log=write
# end of job file.

On this boat anchor, the claimed bandwidth will toggle between 447 kB/s and 512 kB/s for an affected setup. IOPS are between 1 and 10 for an affected setup.

Thanks! If somehow I landed on the wrong commit, let me know, and I'll try again.

Thanks!

Michael
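
The bandwidth and IOPS figures quoted for the fio job are consistent with each other, as the small calculation below shows. It is a sketch under one assumption: that fio's "kB/s" here means KiB/s, matching the job's bs=64k block size. The function name is made up for illustration.

```python
def iops_from_bandwidth(kib_per_s, block_kib=64):
    """IOPS implied by a bandwidth figure at a fixed block size."""
    return kib_per_s / block_kib


if __name__ == "__main__":
    for bw in (447, 512):
        print("%d kB/s -> %.1f IOPS" % (bw, iops_from_bandwidth(bw)))
```

Both results (about 7 and 8 IOPS) fall inside the "between 1 and 10" range reported above.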
Re: 3.14.0+/x86: lockdep and mutexes not getting along
On Wed, 9 Apr 2014, Jason Low wrote:

> On Wed, 2014-04-09 at 15:19 +0300, Kirill A. Shutemov wrote:
> > On Sun, Apr 06, 2014 at 01:12:14AM -0400, Michael L. Semon wrote:
> > > Hi! Starting early in this merge window for 3.15, lockdep has been
> > > giving me trouble. Normally, a splat will happen, lockdep will shut
> > > itself off, and my i686 Pentium 4 PC will continue. Now, after the
> > > splat, it will allow one key of input at either a VGA console or over
> > > serial. After that, only the magic SysRq keys and KDB still work.
> > > File activity stops, and many processes are stuck in the D state.
> > >
> > > Bisect brought me here:
> > >
> > > root@plbearer:/usr/src/kernel-git/linux# git bisect good
> > > 6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
> > > commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
> > > Author: Peter Zijlstra
> > > Date:   Wed Mar 12 13:24:42 2014 +0100
> > >
> > >     locking/mutex: Fix debug checks
> > >
> > >     OK, so commit:
> > >
> > >       1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the
> > >       wait_lock")
> > >
> > >     generates this boot warning when CONFIG_DEBUG_MUTEXES=y:
> > >
> > >       WARNING: CPU: 0 PID: 139 at
> > >       /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82
> > >       debug_mutex_unlock+0x155/0x180() DEBUG_LOCKS_WARN_ON(lock->owner !=
> > >       current)
> > >
> > >     And that makes sense, because as soon as we release the lock a
> > >     new owner can come in...
> > >
> > >     One would think that !__mutex_slowpath_needs_to_unlock()
> > >     implementations suffer the same, but for DEBUG we fall back to
> > >     mutex-null.h which has an unconditional 1 for that.
> > >
> > >     The mutex debug code requires the mutex to be unlocked after
> > >     doing the debug checks, otherwise it can find inconsistent
> > >     state.
> > >
> > >     Reported-by: Ingo Molnar
> > >     Signed-off-by: Peter Zijlstra
> > >     Cc: jason.l...@hp.com
>
> Hello,
>
> As a starting point, would either of you like to test the following
> patch to see if it fixes the issue? This patch essentially generates the
> same code as in older kernels in the debug case. This applies on top of
> kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.
>
> Thanks.
>
> ---
> diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
> index e1191c9..faf6f5b 100644
> --- a/kernel/locking/mutex-debug.c
> +++ b/kernel/locking/mutex-debug.c
> @@ -83,12 +83,6 @@ void debug_mutex_unlock(struct mutex *lock)
>
>  	DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
>  	mutex_clear_owner(lock);
> -
> -	/*
> -	 * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
> -	 * mutexes so that we can do it here after we've verified state.
> -	 */
> -	atomic_set(&lock->count, 1);
>  }
>
>  void debug_mutex_init(struct mutex *lock, const char *name,
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index bc73d33..f1f672e 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -34,13 +34,6 @@
>  #ifdef CONFIG_DEBUG_MUTEXES
>  # include "mutex-debug.h"
>  # include <asm-generic/mutex-null.h>
> -/*
> - * Must be 0 for the debug case so we do not do the unlock outside of the
> - * wait_lock region. debug_mutex_unlock() will do the actual unlock in this
> - * case.
> - */
> -# undef __mutex_slowpath_needs_to_unlock
> -# define __mutex_slowpath_needs_to_unlock() 0
>  #else
>  # include "mutex.h"
>  # include <asm/mutex.h>
> @@ -688,6 +681,17 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
>  	unsigned long flags;
>
>  	/*
> +	 * In the debug cases, obtain the wait_lock first
> +	 * before calling the following debugging functions.
> +	 */
> +#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_DEBUG_LOCK_ALLOC)
> +	spin_lock_mutex(&lock->wait_lock, flags);
> +#endif
> +
> +	mutex_release(&lock->dep_map, nested, _RET_IP_);
> +	debug_mutex_unlock(lock);
> +
> +	/*
>  	 * some architectures leave the lock unlocked in the fastpath failure
>  	 * case, others need to leave it locked. In the later case we have to
>  	 * unlock it here
> @@ -695,9 +699,9 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
>  	if (__mutex_slowpath_needs_t
3.14.0+/x86: lockdep and mutexes not getting along
Hi!

Starting early in this merge window for 3.15, lockdep has been giving me trouble. Normally, a splat will happen, lockdep will shut itself off, and my i686 Pentium 4 PC will continue. Now, after the splat, it will allow one key of input at either a VGA console or over serial. After that, only the magic SysRq keys and KDB still work. File activity stops, and many processes are stuck in the D state.

Bisect brought me here:

root@plbearer:/usr/src/kernel-git/linux# git bisect good
6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
Author: Peter Zijlstra
Date:   Wed Mar 12 13:24:42 2014 +0100

    locking/mutex: Fix debug checks

    OK, so commit:

      1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the wait_lock")

    generates this boot warning when CONFIG_DEBUG_MUTEXES=y:

      WARNING: CPU: 0 PID: 139 at /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82 debug_mutex_unlock+0x155/0x180()
      DEBUG_LOCKS_WARN_ON(lock->owner != current)

    And that makes sense, because as soon as we release the lock a
    new owner can come in...

    One would think that !__mutex_slowpath_needs_to_unlock()
    implementations suffer the same, but for DEBUG we fall back to
    mutex-null.h which has an unconditional 1 for that.

    The mutex debug code requires the mutex to be unlocked after
    doing the debug checks, otherwise it can find inconsistent
    state.

    Reported-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Cc: jason.l...@hp.com
    Link: http://lkml.kernel.org/r/20140312122442.gb27...@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

:040000 040000 80e40c2009942a31f98127c4f9fa958f34b3947b f46ed4b70c4f30fc665fe8f810d3c13920cd765a M	kernel

Indeed, my issues are solved (so far) simply by reverting this commit.

Might someone test lockdep on x86 to see if this is a consistent issue that needs to be adjusted? My lockdep splats are generated by running xfstests test generic/113 on XFS, but splats caused by other issues should still create the same symptoms.

Otherwise, this 3.15 kernel has been rather kind to me so far. PC is an i686 Pentium 4 with 1280 MB RAM and old IDE hardware, running Slackware 14.1.

Thanks!

Michael
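
The ordering problem that the quoted commit message describes ("as soon as we release the lock a new owner can come in") can be illustrated with a small deterministic model. This is not the kernel's mutex: `ToyMutex` and the helper functions are hypothetical, and the "intruder" acquiring the lock right after release stands in for a racing CPU. The count convention (1 = unlocked) follows the `atomic_set(&lock->count, 1)` line in the patch quoted earlier in this thread.

```python
class ToyMutex:
    """Minimal stand-in for a mutex with an owner field (not kernel code)."""

    def __init__(self):
        self.count = 1      # 1 = unlocked, 0 = locked
        self.owner = None

    def lock(self, task):
        if self.count != 1:
            raise RuntimeError("mutex is already locked")
        self.count = 0
        self.owner = task


def debug_check(mutex, task):
    """Models DEBUG_LOCKS_WARN_ON(lock->owner != current): True means OK."""
    return mutex.owner == task


def unlock_release_first(mutex, task, intruder):
    """Release the lock, then run the debug check (the problematic order)."""
    mutex.count = 1
    mutex.lock(intruder)            # a new owner comes in immediately
    return debug_check(mutex, task)


def unlock_check_first(mutex, task, intruder):
    """Run the debug check while still holding the lock, then release."""
    ok = debug_check(mutex, task)
    mutex.owner = None
    mutex.count = 1
    mutex.lock(intruder)            # the new owner arrives after the check
    return ok


if __name__ == "__main__":
    m = ToyMutex()
    m.lock("A")
    print(unlock_release_first(m, "A", "B"))  # False: check sees owner "B"

    m = ToyMutex()
    m.lock("A")
    print(unlock_check_first(m, "A", "B"))    # True: check ran before release
```

The model only shows why the debug checks have to precede the release; it says nothing about why the fix then hangs this particular machine.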
Re: [PATCH] jfs: fix generic posix ACL regression
Looks good. I'm keeping the patch. It was placed through the following tests:

*) the test suite from the January 19 git acl, as root
   (git://git.savannah.nongnu.org/acl.git);
*) the acl test suite, as a regular user;
*) xfstests (full run); and
*) some fs_mark and LTP fsstress in a dir populated with three default ACLs.

Tests were done on an i686 Pentium 4 PC, using kernels 3.14.0-rc1+ and 3.13.0+. Comparisons were made with standard XFS. Due to non-JFS issues, lockdep is off and CONFIG_AIO=n, meaning that their effects are not represented here.

The xfstests run for JFS looked unchanged over kernel 3.13.0+, with the exception of the error "+error: ctime not updated after setfacl" from generic/307. The test uses integer seconds to compare ctime, so I inserted some regular stat commands into the test. The nanosecond portion of the timestamps does not change, either. JFS from 3.13.0+ passed this test; XFS from 3.14.0-rc1+ passed this test as well.

For JFS, the new generic ACL infrastructure provides a reduction of 99 failures from the acl test suite, most of them from the misc.test series of tests. Pass or fail, results were equal to that of XFS, which seemed to gain 10 errors from the root/*.test tests.

Thanks! Christoph's generic ACL work now makes much more sense to me than it did at this time yesterday. The prompt patch from Shaggy allows me to use it on JFS as well :-)

Michael

On Fri, 7 Feb 2014, Dave Kleikamp wrote:

> I missed a couple errors in reviewing the patches converting jfs
> to use the generic posix ACL function. Setting ACL's currently
> fails with -EOPNOTSUPP.
>
> Signed-off-by: Dave Kleikamp
> Reported-by: Michael L. Semon
> Cc: Christoph Hellwig
> ---
>  fs/jfs/xattr.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/fs/jfs/xattr.c b/fs/jfs/xattr.c
> index 3bd5ee4..46325d5 100644
> --- a/fs/jfs/xattr.c
> +++ b/fs/jfs/xattr.c
> @@ -854,9 +854,6 @@ int jfs_setxattr(struct dentry *dentry, const char *name, const void *value,
>  	int rc;
>  	tid_t tid;
>
> -	if ((rc = can_set_xattr(inode, name, value, value_len)))
> -		return rc;
> -
>  	/*
>  	 * If this is a request for a synthetic attribute in the system.*
>  	 * namespace use the generic infrastructure to resolve a handler
> @@ -865,6 +862,9 @@ int jfs_setxattr(struct dentry *dentry, const char *name, const void *value,
>  	if (!strncmp(name, XATTR_SYSTEM_PREFIX, XATTR_SYSTEM_PREFIX_LEN))
>  		return generic_setxattr(dentry, name, value, value_len, flags);
>
> +	if ((rc = can_set_xattr(inode, name, value, value_len)))
> +		return rc;
> +
>  	if (value == NULL) {	/* empty EA, do not remove */
>  		value = "";
>  		value_len = 0;
> @@ -1034,9 +1034,6 @@ int jfs_removexattr(struct dentry *dentry, const char *name)
>  	int rc;
>  	tid_t tid;
>
> -	if ((rc = can_set_xattr(inode, name, NULL, 0)))
> -		return rc;
> -
>  	/*
>  	 * If this is a request for a synthetic attribute in the system.*
>  	 * namespace use the generic infrastructure to resolve a handler
> @@ -1045,6 +1042,9 @@ int jfs_removexattr(struct dentry *dentry, const char *name)
>  	if (!strncmp(name, XATTR_SYSTEM_PREFIX, XATTR_SYSTEM_PREFIX_LEN))
>  		return generic_removexattr(dentry, name);
>
> +	if ((rc = can_set_xattr(inode, name, NULL, 0)))
> +		return rc;
> +
>  	tid = txBegin(inode->i_sb, 0);
>  	mutex_lock(&ji->commit_mutex);
>  	rc = __jfs_setxattr(tid, dentry->d_inode, name, NULL, 0, XATTR_REPLACE);
> @@ -1061,7 +1061,7 @@ int jfs_removexattr(struct dentry *dentry, const char *name)
>   * attributes are handled directly.
>   */
>  const struct xattr_handler *jfs_xattr_handlers[] = {
> -#ifdef JFS_POSIX_ACL
> +#ifdef CONFIG_JFS_POSIX_ACL
>  	&posix_acl_access_xattr_handler,
>  	&posix_acl_default_xattr_handler,
>  #endif
> --
> 1.8.5.3
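
The reordering in the patch above can be pictured with a toy model. It is a sketch under a loud assumption: that the permission check rejects `system.*` names with -EOPNOTSUPP, which would explain the reported failure but is not shown in the quoted diff. All function names below are simplified, hypothetical stand-ins for the C functions, and the separate `#ifdef` fix is not modeled.

```python
import errno

SYSTEM_PREFIX = "system."


def can_set_xattr(name):
    """Hypothetical permission check, ASSUMED to reject system.* names."""
    if name.startswith(SYSTEM_PREFIX):
        return -errno.EOPNOTSUPP
    return 0


def generic_setxattr(name):
    """Stand-in for dispatch to the generic xattr handlers."""
    return 0


def setxattr_before_patch(name):
    """Old order: permission check first, system.* dispatch second."""
    rc = can_set_xattr(name)
    if rc:
        return rc
    if name.startswith(SYSTEM_PREFIX):
        return generic_setxattr(name)
    return 0


def setxattr_after_patch(name):
    """New order: system.* names are dispatched before the check runs."""
    if name.startswith(SYSTEM_PREFIX):
        return generic_setxattr(name)
    rc = can_set_xattr(name)
    if rc:
        return rc
    return 0


if __name__ == "__main__":
    acl = "system.posix_acl_access"
    print(setxattr_before_patch(acl) == -errno.EOPNOTSUPP)  # True
    print(setxattr_after_patch(acl))                        # 0
```

Under that assumption, names outside the system.* namespace behave the same in both orderings; only the ACL path changes.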
code is fine but needs lockdep annotation (xfstests, x86, 3.13.0-rc3)
Hi!

Since a particular aio-next merge during the kernel 3.12 development cycle, I've had issues with AIO and xfstests. A typical affected test would stall, and a SysRq-g would help me get to this stack trace (just for context; the lockdep report is one page down):

kdb> btp 1176
Stack traceback for pid 1176
0xdef27cb0 1176 941 0 0 D 0xdef27e70 fsx
 d427fdbc 0086 def26660 0082 0001 def27cb0 d427e000 def27cb0
 df42ca00 df426a00 d427fd80 c103cc7d dec2b880 d427fd88 c1030631 d427fda0
 c1031048 003d df42ca00 dec2b880 df426a00 d427fdd0 c103115b
Call Trace:
 [] ? wake_up_process+0x1a/0x2e
 [] ? wake_up_worker+0x19/0x1b
 [] ? insert_work+0x4f/0xa5
 [] ? __queue_work+0xbd/0x1c0
 [] schedule+0x1d/0x47                          # always
 [] schedule_timeout+0x99/0xf3                  # always
 [] ? __internal_add_timer+0x99/0x99
 [] io_schedule_timeout+0x37/0x48
 [] balance_dirty_pages.isra.33+0x2e1/0x367     # always
 [] balance_dirty_pages_ratelimited+0x9d/0xb9   # always
 [] do_wp_page.isra.93+0x355/0x510
 [] __handle_mm_fault+0x2b8/0x555
 [] ? vmalloc_sync_all+0xdb/0xdb
 [] ? common_interrupt+0x30/0x35
 [] handle_mm_fault+0x1e/0x24
 [] __do_page_fault+0x12e/0x3bf
 [] ? irq_exit+0x37/0x5f
 [] ? do_IRQ+0x3d/0x84
 [] ? vmalloc_sync_all+0xdb/0xdb
 [] do_page_fault+0x8/0xf
 [] error_code+0x65/0x6c
 [] ? __rpc_unlink+0xd/0x34

With the new AIO fixes, runtime endurance has increased greatly. Usually, a test session will stall like the above, though balance_dirty_pages() is in the KDB stack trace only about 25% of the time now, and the failed test will be one of the specialized AIO and/or DIO tests, not just fsx or fsstress. This is regardless of whether XFS, JFS, or NILFS2 is being run through xfstests.

In the process of weeding out the tests that almost always cause a stall or lockup, I've gotten this message sent to remote syslog on 4 different occasions:

# xfstests generic/208 is described as "run aio-dio-invalidate-failure -
# test race in read cache invalidation." The test is using devel XFS.

logger: run xfstest generic/208
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 0 Comm: swapper Not tainted 3.13.0-rc3+ #1
Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
 0002 0002 df40bd24 d0504e1d df40bd30 d050047e d060f960 df40bda4
 d004d5b4 0122 64dd6f07 0e6f a7c10e62 19e9 01f6 df40bd84
 d004501f 61e23bf6 de391350 d068f4a0 61e23bf6
Call Trace:
 [] dump_stack+0x16/0x18
 [] register_lock_class.part.42+0x32/0x36
 [] __lock_acquire+0x175a/0x1897
 [] ? sched_clock_local.constprop.3+0x39/0x131
 [] lock_acquire+0x70/0x99
 [] ? aio_complete+0x73/0x281
 [] _raw_spin_lock_irqsave+0x45/0x75
 [] ? aio_complete+0x73/0x281
 [] aio_complete+0x73/0x281
 [] ? mempool_free+0x3b/0x76
 [] ? local_clock+0x3d/0x58
 [] ? xfs_destroy_ioend+0x35/0x39
 [] ? xfs_end_io+0x2d/0xe6
 [] dio_complete+0x7f/0x106
 [] dio_bio_end_aio+0x66/0xda
 [] bio_endio+0x14/0x26
 [] blk_update_request+0x73/0x2ce
 [] ? sched_clock_cpu+0x8f/0xe2
 [] blk_update_bidi_request+0xe/0x56
 [] blk_end_bidi_request+0x1d/0x5d
 [] blk_end_request+0x12/0x14
 [] scsi_io_completion+0x83/0x536
 [] ? scsi_device_unbusy+0x91/0x99
 [] scsi_finish_command+0x92/0xc1
 [] scsi_softirq_done+0xda/0xf7
 [] ? __do_softirq+0x72/0x1a1
 [] ? _local_bh_enable+0x3c/0x3c
 [] blk_done_softirq+0x65/0x73
 [] __do_softirq+0xae/0x1a1
 [] ? _local_bh_enable+0x3c/0x3c
 [] ? irq_exit+0x65/0x67
 [] ? do_IRQ+0x3d/0x97
 [] ? common_interrupt+0x35/0x3a
 [] ? default_idle+0xa/0xc
 [] ? arch_cpu_idle+0x1a/0x21
 [] ? cpu_startup_entry+0x75/0x12c
 [] ? rest_init+0xb1/0xb7
 [] ? rest_init+0x36/0xb7
 [] ? start_kernel+0x2f1/0x2f7
 [] ? repair_env_string+0x51/0x51
 [] ? i386_start_kernel+0x12e/0x131
Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
CPU: 0 PID: 0 Comm: swapper Not tainted 3.13.0-rc3+ #1
Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
 d006ec78 d006ec78 df40bba4 d0504e1d df40bbc4 d04ffdfb d060e730 d08791c0
 d0075aa8 d006ec78 0019 df45c400 df40bbdc d006ed23 d061389c 0001
 df40bc30 d0076b33 df40bc20 d0052d8a
Call Trace:
 [] ? update_timers_all_cpus+0x52/0x52
 [] ? update_timers_all_cpus+0x52/0x52
 [] dump_stack+0x16/0x18
 [] panic+0x82/0x174
 [] ? perf_event_update_userpage+0x114/0x16c
 [] ? update_timers_all_cpus+0x52/0x52
 [] watchdog_overflow_callback+0xab/0xab
 [] __perf_event_overflow+0xa7/0x35c
 [] ? vprintk_emit+0x19d/0x46d
 [] ? x86_perf_event_set_period+0x11e/0x1e9
 [] perf_event_overflow+0x15/0x17
 [] p4_pmu_handle_irq+0x137/0x1ec
 [] ? i386_start_kernel+0x12e/0x131
 [] ? print_context_stack+0x58/0x92
 [] perf_event_nmi_handler+0x26/0x40
 [] nmi_handle.isra.1+0x7e/0x168
 [] ? nmi_handle.isra.1+0x24/0x168
 [] ? show_stack+0
Re: [REGRESSION] x86 vmalloc issue from recent 3.10.0+ commit
Thanks. I'll re-review this, anyway, and re-bisect if time allows. The kernel/SGI-XFS combo pulled last night did much better in this regard. The problem is down to a different and single backtrace about vmalloc, and the PC is controllable now. The old git was moved to a different folder, though, in case it's still needed.

Michael

On Tue, Jul 9, 2013 at 9:59 PM, Dave Jones wrote:
> On Tue, Jul 09, 2013 at 09:51:32PM -0400, Michael L. Semon wrote:
>
> > kernel: [ 2580.395592] vmap allocation for size 20480 failed: use vmalloc= to increase size.
> > kernel: [ 2580.395761] vmalloc: allocation failure: 16384 bytes
>
> I was seeing a lot of these recently too.
> (Though I also saw memory corruption afterwards possibly caused by
> a broken fallback path somewhere when that vmalloc fails)
>
> http://comments.gmane.org/gmane.linux.kernel.mm/102895
>
> Dave
>
[REGRESSION] x86 vmalloc issue from recent 3.10.0+ commit
Hi!

I'm doing volunteer testing of xfstests and was sent here to ask about this issue. I apologize in advance if the problem has already been solved...

I've been testing XFS from various git kernels on 32-bit Pentium 4 and Pentium III PCs. There was an issue with xfstests test xfs/167, which is one of many tests that run a lot of instances of the program "fsstress" to try to break something. Usually, the test passes, but this time, the Pentium 4 got stuck in a loop, and the Pentium III had processes killed but didn't seem to have resources released back to the system.

The solution was to bisect the kernel to find the problem commit, get a patch out of it, then use the patch to reverse the commit.

The kernel git used was pulled on July 7. SGI's xfs-oss/master was updated as well, and these additional XFS patches were applied:

xfs: clean up unused codes at xfs_bulkstat()
xfs: dquot log reservations are too small
xfs: remove local fork format handling from xfs_bmapi_write()
xfs: update mount options documentation

Hopefully, such merging and patching won't be needed to reproduce the problem on your end.

The problem is 100% reproducible here. The rest of this letter is supplementary data from the Pentium 4 PC.

Thanks!

Michael

The partition used in the test was this (from `gdisk /dev/sdb`):

   5        70189056        90046463   9.5 GiB     8300  gScratchDev

The original/fixed test behaviors look like this to xfstests:

root@plbearer:/var/lib/xfstests# ./check xfs/167
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/i686 plbearer 3.10.0+
MKFS_OPTIONS  -- -f -bsize=4096 /dev/sdb5
MOUNT_OPTIONS -- /dev/sdb5 /mnt/xfstests-scratch

xfs/167 922s ... 891s
Ran: xfs/167
Passed all 1 tests

On a failing test, the hard drive light dies after a while, and it's impossible to switch framebuffer consoles (i915).

This is the beginning of the infinite loop started by hitting Alt-Shift-SysRq-e-i-e-i-s-u-s, captured over netconsole (the first SysRq-s seems to be the trigger):

logger: run xfstest xfs/167
kernel: [ 2497.774818] XFS (sdb5): Version 5 superblock detected. This kernel has EXPERIMENTAL support enabled!
kernel: [ 2497.774818] Use of these features in this kernel is at your own risk!
kernel: [ 2497.862312] XFS (sdb5): Mounting Filesystem
kernel: [ 2580.395592] vmap allocation for size 20480 failed: use vmalloc= to increase size.
kernel: [ 2580.395761] vmalloc: allocation failure: 16384 bytes
kernel: [ 2580.395769] fsstress: page allocation failure: order:0, mode:0x80d2
kernel: [ 2580.395776] CPU: 0 PID: 6262 Comm: fsstress Not tainted 3.10.0+ #1
kernel: [ 2580.395781] Hardware name: Dell Computer Corporation Dimension 2350/07W080, BIOS A01 12/17/2002
kernel: [ 2580.395785] 0001 0001 c50b3bfc c14825b2 c50b3c24 c10a1c6c c15c1f70 ee319bb4
kernel: [ 2580.395802] 80d2 c50b3c38 c15c34a4 c50b3c14 fffa c50b3c54 c10c3243
kernel: [ 2580.395817] 80d2 c15c34a4 4000 c6bffb50 4000 c6bffb80 f06f
kernel: [ 2580.395832] Call Trace:
kernel: [ 2580.395847] [] dump_stack+0x16/0x18
kernel: [ 2580.395859] [] warn_alloc_failed+0xb4/0xe7
kernel: [ 2580.395868] [] __vmalloc_node_range+0x16d/0x1cf
kernel: [ 2580.395875] [] __vmalloc_node+0x48/0x4f
kernel: [ 2580.395884] [] ? kmem_zalloc_greedy+0x21/0x2c
kernel: [ 2580.395890] [] vzalloc+0x30/0x32
kernel: [ 2580.395897] [] ? kmem_zalloc_greedy+0x21/0x2c
kernel: [ 2580.395904] [] kmem_zalloc_greedy+0x21/0x2c
kernel: [ 2580.395913] [] xfs_bulkstat+0x12a/0x94b
kernel: [ 2580.395921] [] ? lock_release_non_nested+0xa0/0x2b7
kernel: [ 2580.395931] [] ? might_fault+0x7c/0x9b
kernel: [ 2580.395938] [] ? might_fault+0x49/0x9b
kernel: [ 2580.395945] [] ? might_fault+0x93/0x9b
kernel: [ 2580.395954] [] ? _copy_from_user+0x3f/0x57
kernel: [ 2580.395961] [] xfs_ioc_bulkstat+0xba/0x15a
kernel: [ 2580.395968] [] ? xfs_bulkstat_one_int+0x2ff/0x2ff
kernel: [ 2580.395975] [] xfs_file_ioctl+0x6b9/0xa0d
kernel: [ 2580.395984] [] ? dput+0x2d/0x263
kernel: [ 2580.395990] [] ? dput+0x219/0x263
kernel: [ 2580.395999] [] ? _raw_spin_unlock+0x22/0x30
kernel: [ 2580.396006] [] ? dput+0x219/0x263
kernel: [ 2580.396013] [] ? mntput+0x1d/0x28
kernel: [ 2580.396022] [] ? terminate_walk+0x63/0x66
kernel: [ 2580.396030] [] ? do_last+0x1a9/0xbfa
kernel: [ 2580.396036] [] ? link_path_walk+0x54/0x6c2
kernel: [ 2580.396044] [] ? path_openat+0xaf/0x515
kernel: [ 2580.396053] [] ? __fd_install+0x1f/0x4a
kernel: [ 2580.396060] [] ? xfs_ioc_getbmapx+0x9b/0x9b
kernel: [ 2580.396068] [] do_vfs_ioctl+0x2f6/0x4cc
kernel: [ 2580.396076] [] ? __fd_install+0x40/0x4a
kernel: [ 2580.396083] [] ? _raw_spin_unlock+0x22/0x30
kernel: [ 2580.396090] [] ? final_putname+0x1d/0x36
kernel: [ 2580.396097] [] ? final_putname+0x1d/0x36
kernel: [ 2580.396104] [] ? putname+0x23/0x2f
kernel: [ 2580.396112] [] ? do_sys_open+0x17d/0x1d8
kernel: [ 2580.396120] [] ? restore_all+0xf/0xf
kernel: [ 2580.396127] [] SyS_ioc
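
The two sizes in the log (a 16384-byte vmalloc request and a 20480-byte vmap area) differ by exactly one 4096-byte page. The sketch below reproduces that arithmetic under the assumption that the kernel rounds each request up to whole pages and appends one guard page to the area; the function name is made up for illustration.

```python
PAGE_SIZE = 4096  # page size on i686


def vmap_area_size(request_bytes, page_size=PAGE_SIZE):
    """Address space consumed by one vmalloc request.

    ASSUMPTION: the request is rounded up to whole pages and one extra
    guard page is appended to the area.
    """
    pages = -(-request_bytes // page_size)  # ceiling division
    return (pages + 1) * page_size


if __name__ == "__main__":
    print(vmap_area_size(16384))  # 20480
```

If that assumption holds, the two messages describe the same failed allocation rather than two separate ones.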