Re: sysfs: write returns ENOMEM?
On 8/23/05, Nathan Scott [EMAIL PROTECTED] wrote:
FWIW, all filesystems using the generic page cache routines are able to return this - see mm/filemap.c - generic_file_buffered_write...

I don't think it makes much sense to fix this in individual filesystems as many functions returning -ENOMEM can be used in other paths as well where they're ok. Andrew, please consider picking this up for -mm. (I've included it as an attachment as well as gmail will surely mess up the patch. Sorry.)

Pekka

[PATCH] VFS: return ENOBUFS instead of ENOMEM for vfs_write()

As noticed by Dmitry Torokhov, write() can not return ENOMEM:

http://www.opengroup.org/onlinepubs/95399/functions/write.html

Currently almost all filesystems can return -ENOMEM due to generic_file_buffered_write() in mm/filemap.c so filter out the invalid error code in vfs_write().

Signed-off-by: Pekka Enberg [EMAIL PROTECTED]
---

 read_write.c |    2 ++
 1 files changed, 2 insertions(+)

Index: 2.6-mm/fs/read_write.c
===
--- 2.6-mm.orig/fs/read_write.c
+++ 2.6-mm/fs/read_write.c
@@ -310,6 +310,8 @@ ssize_t vfs_write(struct file *file, con
 		}
 	}

+	if (ret == -ENOMEM)
+		ret = -ENOBUFS;
 	return ret;
 }
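For readers following along outside the kernel tree, the filtering idea in the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions: filter_write_error() is a hypothetical helper name, not a kernel function, and it simply mirrors the two added lines of the patch on a kernel-style return value (a byte count on success, a negative errno on failure).

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Hypothetical userspace stand-in for the check the patch adds at the
 * end of vfs_write(): a negative ENOMEM is rewritten to a negative
 * ENOBUFS, every other return value passes through unchanged.
 */
ssize_t filter_write_error(ssize_t ret)
{
	if (ret == -ENOMEM)
		return -ENOBUFS;
	return ret;
}
```

The point of doing this in one place is that every filesystem's write path funnels through the same exit, so no individual -ENOMEM site has to change; the follow-ups in this thread argue for the opposite trade-off.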
Re: sysfs: write returns ENOMEM?
Pekka Enberg [EMAIL PROTECTED] wrote: @@ -310,6 +310,8 @@ ssize_t vfs_write(struct file *file, con } } + if (ret == -ENOMEM) + ret = -ENOBUFS; return ret; } That's lame. It'd be better to hunt down all the -ENOMEMs and fix them up. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: sysfs: write returns ENOMEM?
Andrew Morton writes: That's lame. It'd be better to hunt down all the -ENOMEMs and fix them up. So there's our verdict. Thanks, Andrew :-) Pekka
Re: 2.6.12 Performance problems
Danial Thom wrote:
--- Jesper Juhl [EMAIL PROTECTED] wrote:
On 8/21/05, Danial Thom [EMAIL PROTECTED] wrote:

I just started fiddling with 2.6.12, and there seems to be a big drop-off in performance from 2.4.x in terms of networking on a uniprocessor system. Just bridging packets through the machine, 2.6.12 starts dropping packets at ~100Kpps, whereas 2.4.x doesn't start dropping until over 350Kpps on the same hardware (2.0Ghz Opteron with e1000 driver). This is pitiful performance for this hardware. I've increased the rx ring in the e1000 driver to 512 with little change (interrupt moderation is set to 8000 Ints/second). Has tuning for MP destroyed UP performance altogether, or is there some tuning parameter that could make a 4-fold difference? All debugging is off and there are no messages on the console or in the error logs. The kernel is the standard kernel.org download config with SMP turned off and the intel ethernet card drivers as modules without any other changes, which is exactly the config for my 2.4 kernels.

If you have preemption enabled you could disable it. Low latency comes at the cost of decreased throughput - can't have both. Also try using a HZ of 100 if you are currently using 1000, that should also improve throughput a little at the cost of slightly higher latencies. I doubt that it'll make any huge difference, but if it does, then that would probably be valuable info.

Ok, well you'll have to explain this one: Low latency comes at the cost of decreased throughput - can't have both

Configuring preempt gives lower latency, because then almost anything can be interrupted (preempted). You can then get very quick responses to some things, i.e. interrupts and such. The cost comes, because _something_ was interrupted, something that instead would run to completion first in a kernel made without preempt. So that other thing, whatever it is, got slower. And the problem is bigger than merely things happening in a different order.

Switching the cpu from one job to another has a big overhead. Particularly, the cpu caches have to be refilled more often, which takes time. Running one big job to completion fills the cache with that job's data _once_. If the job is preempted a couple of times you have to bring it into cache three times instead, and that will cost you, performance wise.

This is not _necessarily_ your problem, but trying a 2.6 kernel without preempt and with HZ=100 (both things configurable through normal kernel configuration) will clearly show if this is the problem in your case. If you're lucky, this is all you need to get your performance back. If not, then at least it is an important datapoint for those trying to figure it out. Nobody here wants 2.6 to have 1/4 of the performance of 2.4!

Helge Hafting
[PATCH] mm: return ENOBUFS instead of ENOMEM in generic_file_buffered_write
As noticed by Dmitry Torokhov, write() can not return ENOMEM:

http://www.opengroup.org/onlinepubs/95399/functions/write.html

Therefore fixup generic_file_buffered_write() in mm/filemap.c (pointed out by Nathan Scott).

Signed-off-by: Pekka Enberg [EMAIL PROTECTED]
---

 filemap.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: 2.6-mm/mm/filemap.c
===
--- 2.6-mm.orig/mm/filemap.c
+++ 2.6-mm/mm/filemap.c
@@ -1942,7 +1942,7 @@ generic_file_buffered_write(struct kiocb
 		page = __grab_cache_page(mapping,index,cached_page,lru_pvec);
 		if (!page) {
-			status = -ENOMEM;
+			status = -ENOBUFS;
 			break;
 		}
Re: 2.6.13-rc6-mm2
On Mon, 22 Aug 2005 21:30:21 -0700, Andrew Morton [EMAIL PROTECTED] wrote:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/
- Various updates. Nothing terribly noteworthy.

adm9240 i2c still broken, spamming debug with:

Aug 23 18:48:40 peetoo kernel: [ 1591.151460] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.151834] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.170515] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.170881] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.189837] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.190217] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.208927] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.209296] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00

This starts as soon as I write to sysfs. Dunno where to start, this is from the adm9240 driver that works in 2.6.13-rc6-git12 but not -mm1 or -mm2. The terminal is lost, but I am able to log in on another terminal. -mm2 was okay until I wrote to sysfs. With -mm1 it failed on reading the sysfs area as well, so there's a little progress.

top:

top - 18:52:07 up 29 min, 2 users, load average: 0.99, 0.62, 0.26
Tasks: 50 total, 3 running, 47 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3% us, 0.0% sy, 0.0% ni, 99.7% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 515360k total, 146504k used, 368856k free, 15932k buffers
Swap: 514000k total, 0k used, 514000k free, 109296k cached

Grant.
Re: use of uninitialized pointer in jffs_create()
On Tue, 23 August 2005 01:07:58 +0200, Adrian Bunk wrote:
On Mon, Aug 22, 2005 at 12:45:59PM +0200, Jörn Engel wrote:
On Sun, 21 August 2005 00:28:08 +0200, Jesper Juhl wrote:

gcc kindly pointed me at jffs_create() with this warning :
fs/jffs/inode-v23.c:1279: warning: `inode' might be used uninitialized in this function

Real fix would be to finally remove that code. Except for the usual change this function in the whole kernel stuff, no one has touched it for ages.

That's wrong, this -mm specific bug comes from git-ocfs2.patch .

Ack. If I wasn't this lazy, I'd still propose to completely remove jffs - it's been old and deprecated for a few years already.

Jörn

--
Public Domain  - Free as in Beer
General Public - Free as in Speech
BSD License    - Free as in Enterprise
Shared Source  - Free as in Work will make you...
mass tulip_stop_rxtx() failed, network stops
We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1 kernel, equipped with an onboard card that uses a tulip module:

02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)

No problem with those. We are running four more machines like that, the only difference is the kernel they are running (2.6.11.4). On some of them, there are serious problems with the network, and they usually happen when the traffic is bigger than usual (i.e., some big software deployment to several workstations, remote backup, etc.). The syslog is then full of entries like these:

Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

and it's filling logs for hours; the network doesn't work anymore, and someone has to restart the network or the machine itself. It doesn't always happen with heavy traffic - sometimes you can fill the 100 Mbit link and do lots of reads from the disk, but nothing bad happens for hours.

I saw some posts on this issue (2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed), but it seemed to me that it wasn't similar to my problems; I looked into the 2.6.10 kernel changelog, but there were no descriptions of that problem, either.

Any help appreciated, because rebooting machines which are 500 km away and are not responding is no fun :)

--
Tomek
http://wpkg.org
Re: VIA Rhine ethernet driver bug (reprise...)
[CCing maintainer]

On Monday 22 August 2005 20:29, Udo van den Heuvel wrote:

Hello, It appears that the VIA Rhine chipset has some sort of bug which shows up in both the standard Linux VIA-Rhine driver and the Rhinefet driver that VIA itself provides. The difference is that the connection is dropped in case of the standard Linux driver for VIA Rhine but that the connection remains OK with the Rhinefet driver provided by VIA (http://www.viaarena.com/downloads/Source/rhinefet.tgz and other places on viaarena.com...). So VIA Rhinefet driver consumes more CPU but is also more stable.

I wrote about this issue before:
http://lkml.org/lkml/2005/8/7/82
http://lkml.org/lkml/2005/1/15/47
etc.

I opened a bugzilla case: http://bugzilla.kernel.org/show_bug.cgi?id=5030

Who could find out why the standard Linux driver chokes and the Rhinefet driver doesn't? Who could fix this bug?

My suggestion was, and still is: Since it happens less than once a day, why not just add code to reset the NIC completely in this case, like it is typically done in tx_timeout handlers of many NICs, and forget about it? Do you see any problems in this approach?

--
vda
Re: Problem with MMC card reader
On Mon, Aug 22, 2005 at 11:16:46PM -0600, ScytheBlade1 wrote:

I've enabled everything needed...the CF port works flawlessly. However, the SD slot does *not*. I've got about 5+ pages worth of dmesg output related to this (MMC is NOT debug enabled, and I still get a disturbing amount of output). Would here be the best place to post this, or would a different list be better? (Recommendations as to which list are welcomed :)).

CONFIG_SCSI_MULTI_LUN=y

This is probably the answer you needed... because the reader is a single device with multiple slots, and the usb_storage driver uses the SCSI infrastructure, so multiple slots map to multiple SCSI LUNs. At least, it worked for all my readers during the years :))

--
[EMAIL PROTECTED], IRC:[EMAIL PROTECTED], /bin/zsh. C|NK
Linux moria 2.6.11 #1 Wed Mar 9 19:08:59 CET 2005 i686
11:23:27 up 27 days, 3:56, 4 users, load average: 0.11, 0.23, 0.29
We are Microsoft. First we'll reboot, and then asimilate you.
Re: IRQ problem with PCMCIA
Alan,

The old code can be fixed, I just don't have the time or any desire to look at it again yet. The burnout from the last issues from 2001-2003 cost me some health problems over the stress. If I encounter these problems and become annoyed enough, I will fix it. However, if it is cheaper to buy working hardware, that is the route I will take. You (Alan), if anyone, know anything can be done in Linux, otherwise none of us would have ever put this much effort into its success.

Cheers,
Andre

On Mon, 22 Aug 2005, Alan Cox wrote:
On Mon, 2005-08-22 at 11:25 +0200, Bartlomiej Zolnierkiewicz wrote:

CardBus IDE devices work just fine but there are still issues with hotplug support (work in progress).

work in progress. Yes because I submitted working IDE cardbus hotplug support, and Mark Lord submitted a Delkin driver both of which worked months ago rather nicely and neither of which hit the Bartlomiej stone wall and never got in and are now stale patches.

up ever getting those into the kernel. Please wait instead for the new SATA/ATA layer to develop hotplug support.

This is just FUD to discourage people from working on IDE drivers. Alan is doing this on purpose and doesn't really want to improve things.

It's a realistic assessment based upon over ten years working on the Linux kernel. I do not believe you are capable of fixing the old IDE code. But don't take that personally, I am sceptical that anyone can fix the old IDE code.

Alan
RE: [2.6 patch] cris: extern inline - static inline
Ok, I've made a testcompile and the resulting image size is similar so the patch is good. Acked-by: Mikael Starvik [EMAIL PROTECTED] /Mikael -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Bunk Sent: Monday, August 22, 2005 1:55 AM To: Mikael Starvik Cc: dev-etrax; linux-kernel@vger.kernel.org Subject: [2.6 patch] cris: extern inline - static inline extern inline doesn't make much sense. Signed-off-by: Adrian Bunk [EMAIL PROTECTED] --- arch/cris/arch-v10/README.mm|6 +-- arch/cris/arch-v10/kernel/signal.c |2 - arch/cris/arch-v32/kernel/signal.c |2 - arch/cris/mm/ioremap.c |2 - include/asm-cris/arch-v10/byteorder.h |4 +- include/asm-cris/arch-v10/checksum.h|2 - include/asm-cris/arch-v10/delay.h |2 - include/asm-cris/arch-v10/ide.h |8 ++-- include/asm-cris/arch-v10/system.h |8 ++-- include/asm-cris/arch-v10/thread_info.h |2 - include/asm-cris/arch-v10/timex.h |2 - include/asm-cris/arch-v10/uaccess.h |4 +- include/asm-cris/arch-v32/bitops.h | 10 ++--- include/asm-cris/arch-v32/byteorder.h |4 +- include/asm-cris/arch-v32/checksum.h|2 - include/asm-cris/arch-v32/delay.h |2 - include/asm-cris/arch-v32/ide.h |4 +- include/asm-cris/arch-v32/io.h |6 +-- include/asm-cris/arch-v32/system.h |6 +-- include/asm-cris/arch-v32/thread_info.h |2 - include/asm-cris/arch-v32/timex.h |2 - include/asm-cris/arch-v32/uaccess.h |4 +- include/asm-cris/atomic.h | 22 ++-- include/asm-cris/bitops.h | 18 - include/asm-cris/checksum.h |8 ++-- include/asm-cris/current.h |2 - include/asm-cris/delay.h|2 - include/asm-cris/io.h |6 +-- include/asm-cris/irq.h |2 - include/asm-cris/pgalloc.h | 12 +++--- include/asm-cris/pgtable.h | 44 include/asm-cris/processor.h|4 +- include/asm-cris/semaphore-helper.h |8 ++-- include/asm-cris/semaphore.h| 14 +++ include/asm-cris/system.h |2 - include/asm-cris/timex.h|2 - include/asm-cris/tlbflush.h |4 +- include/asm-cris/uaccess.h | 24 ++--- include/asm-cris/unistd.h | 20 +- 39 files changed, 140 insertions(+), 140 deletions(-) 
--- linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/README.mm.old 2005-08-22 01:38:14.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/README.mm 2005-08-22 01:38:36.0 +0200
@@ -177,7 +177,7 @@
 Given the top-level Page Directory, the offset in that directory is calculated using the upper 8 bits:

-extern inline pgd_t * pgd_offset(struct mm_struct * mm, unsigned long address)
+static inline pgd_t * pgd_offset(struct mm_struct * mm, unsigned long address)
 {
 	return mm->pgd + (address >> PGDIR_SHIFT);
 }
@@ -190,14 +190,14 @@
 Since the Middle Directory does not exist, it is a unity mapping:

-extern inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address)
+static inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address)
 {
 	return (pmd_t *) dir;
 }

 The Page Table provides the final lookup by using bits 13 to 23 as index:

-extern inline pte_t * pte_offset(pmd_t * dir, unsigned long address)
+static inline pte_t * pte_offset(pmd_t * dir, unsigned long address)
 {
 	return (pte_t *) pmd_page(*dir) + ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
--- linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/kernel/signal.c.old 2005-08-22 01:38:45.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/kernel/signal.c 2005-08-22 01:38:54.0 +0200
@@ -476,7 +476,7 @@
  * OK, we're invoking a handler
  */

-extern inline void
+static inline void
 handle_signal(int canrestart, unsigned long sig, siginfo_t *info, struct k_sigaction *ka, sigset_t *oldset, struct pt_regs * regs)
--- linux-2.6.13-rc6-mm1-full/arch/cris/arch-v32/kernel/signal.c.old 2005-08-22 01:39:03.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/arch-v32/kernel/signal.c 2005-08-22 01:39:09.0 +0200
@@ -513,7 +513,7 @@
 }

 /* Invoke a singal handler to, well, handle the signal. */
-extern inline void
+static inline void
 handle_signal(int canrestart, unsigned long sig, siginfo_t *info, struct k_sigaction *ka, sigset_t *oldset, struct pt_regs * regs)
--- linux-2.6.13-rc6-mm1-full/arch/cris/mm/ioremap.c.old 2005-08-22 01:39:18.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/mm/ioremap.c 2005-08-22 01:39:23.0 +0200
@@ -16,7 +16,7 @@
 #include <asm/tlbflush.h>
 #include <asm/arch/memmap.h>
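The README.mm hunk quoted in this patch describes a two-level lookup: the upper 8 bits of an address select the Page Directory entry and bits 13 to 23 select the Page Table entry. A minimal userspace sketch of that index arithmetic follows. The constant values are assumptions derived only from those two statements (a 13-bit page offset, an 11-bit table index, an 8-bit directory index on a 32-bit address), and the helper names are hypothetical rather than taken from the CRIS headers.

```c
#include <stdint.h>

/* Assumed from the README text: bits 0-12 offset, 13-23 table, 24-31 directory. */
#define SKETCH_PAGE_SHIFT   13
#define SKETCH_PGDIR_SHIFT  24
#define SKETCH_PTRS_PER_PTE (1u << (SKETCH_PGDIR_SHIFT - SKETCH_PAGE_SHIFT))

/* Index into the top-level Page Directory: the upper 8 bits. */
uint32_t sketch_pgd_index(uint32_t address)
{
	return address >> SKETCH_PGDIR_SHIFT;
}

/* Index into the Page Table: bits 13 to 23. */
uint32_t sketch_pte_index(uint32_t address)
{
	return (address >> SKETCH_PAGE_SHIFT) & (SKETCH_PTRS_PER_PTE - 1);
}

/* Offset inside the page: the low 13 bits. */
uint32_t sketch_page_offset(uint32_t address)
{
	return address & ((1u << SKETCH_PAGE_SHIFT) - 1);
}
```

For the address 0x12345678 this splits into directory index 0x12, table index 0x1A2 and page offset 0x1678, and shifting the three parts back into place reproduces the original address.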
Re: mass tulip_stop_rxtx() failed, network stops
On 8/23/05, Tomasz Chmielewski [EMAIL PROTECTED] wrote:

We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1 kernel, equipped with an onboard card that uses a tulip module:

02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)

No problem with those. We are running four more machines like that, the only difference is the kernel they are running (2.6.11.4). On some of them, there are serious problems with the network, and they usually happen when the traffic is bigger than usual (i.e., some big software deployment to several workstations, remote backup, etc.). The syslog is then full of entries like these:

Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

I am seeing thousands of tulip_stop_rxtx() failed messages as well with 2.6.11. No regular network failure though. See http://kerneltrap.org/mailarchive/1/message/110291/flat

Cheers,
Jerome
kernel module seg fault
Hi,

I have written a kernel module and I can load (insmod) it without any error. But when I run my module it gets a seg fault at interruptible_sleep_on_timeout(). I have used this function in the following way:

DECLARE_WAIT_QUEUE_HEAD(wq);
init_waitqueue_head(&wq);
interruptible_sleep_on_timeout(&wq, 2);

I am using redhat version 9.0 and kernel version 2.4.20-8. Could you please shed some light on this issue?

Manomugdha Biswas
Re: mass tulip_stop_rxtx() failed, network stops
jerome lacoste wrote:
On 8/23/05, Tomasz Chmielewski [EMAIL PROTECTED] wrote:

We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1 kernel, equipped with an onboard card that uses a tulip module:

02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)

No problem with those. We are running four more machines like that, the only difference is the kernel they are running (2.6.11.4). On some of them, there are serious problems with the network, and they usually happen when the traffic is bigger than usual (i.e., some big software deployment to several workstations, remote backup, etc.). The syslog is then full of entries like these:

Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

I am seeing thousands of tulip_stop_rxtx() failed messages as well with 2.6.11. No regular network failure though. See http://kerneltrap.org/mailarchive/1/message/110291/flat

Lucky you. Really no network problems, no increased ping responses? For me lots of pings are lost, and when this tulip_stop_rxtx() failed happens, the round-trip time for a ping can be as big as 14 seconds in a 100 Mbit LAN.

--
Tomek
http://wpkg.org
Re: [RFC: 2.6 patch] fs/super.c: unexport user_get_super
On Mon, Aug 22, 2005 at 06:20:56PM +0200, Adrian Bunk wrote:

I didn't find any modular usage in the kernel.

And there shouldn't be one either. This is really just for some syscalls, everything else should use get_super based on a struct block_device. If there's any caller using this wrongly in out of tree modules they can be switched to bdget + get_super trivially (fixing their code would be even better).

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
---

This patch was already sent on:
- 30 May 2005
- 13 May 2005
- 1 May 2005
- 23 Apr 2005

--- linux-2.6.12-rc2-mm3-full/fs/super.c.old 2005-04-23 02:45:59.0 +0200
+++ linux-2.6.12-rc2-mm3-full/fs/super.c 2005-04-23 02:46:07.0 +0200
@@ -467,8 +467,6 @@
 	return NULL;
 }

-EXPORT_SYMBOL(user_get_super);
-
 asmlinkage long sys_ustat(unsigned dev, struct ustat __user * ubuf)
 {
 	struct super_block *s;
Re: [PATCH 0/2] external interrupts
On Mon, Aug 22, 2005 at 02:43:30PM -0700, Andrew Morton wrote:

Laughter was not wholly unexpected, though I wasn't joking. I'm trying to be realistic about the lifetime of any given hardware, and IOC4 is several years old at this point. Couple that with a sincere desire to preserve application source compatibility when (not if) new hardware appears, and an abstraction layer seemed to be a logical choice. I'm more than happy to discuss problems in the abstraction layer's interface and make appropriate changes -- I'm nothing if not obliging.

Having an abstraction layer for a single client driver does seem a bit pointless. It would become more pointful if other client drivers were to pop up.

The Octane port will hopefully soon support external interrupts on the ioc3, so this does make sense.
Re: Linux AIO status todo
On Tue, Aug 23, 2005 at 01:14:38PM +0530, Suparna Bhattacharya wrote: 2. No support for propagating IO completion events to user space threads using RT signals. User threads need to poll the completion queue using io_getevents. POSIX specifies that when an AIO request completes, a signal can be delivered to the application to indicate the completion of the IO. POSIX AIO needs to handle SIGEV_NONE, SIGEV_SIGNAL and SIGEV_THREAD notification. Obviously kernel shouldn't create threads for SIGEV_THREAD itself, as kernel shouldn't hardcode all the implementation details how a thread can be created. But it would be good if AIO signalling e.g. handled both SIGEV_SIGNAL and SIGEV_SIGNAL | SIGEV_THREAD_ID, with the same usage as e.g. timer_* syscalls. If kernel makes sure SI_ASYNCIO si_code is set in the notification signal siginfos, glibc could even use just one helper thread for timer_*/[al]io_* and maybe in the future other SIGEV_THREAD notification. Jakub
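As a rough userspace illustration of the notification scheme discussed above, the sketch below fills in a struct sigevent for SIGEV_SIGNAL delivery and shows how a signal handler could tell an AIO completion apart from other sources by checking si_code against SI_ASYNCIO. Both helper names are hypothetical, and the Linux-specific SIGEV_THREAD_ID variant is deliberately left out; this only assumes the POSIX definitions from <signal.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/*
 * Hypothetical helper: build a sigevent asking for a plain signal on
 * completion. The cookie is handed back to the handler in si_value.
 */
struct sigevent make_signal_notification(int signo, void *cookie)
{
	struct sigevent sev;

	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_SIGNAL;
	sev.sigev_signo = signo;
	sev.sigev_value.sival_ptr = cookie;
	return sev;
}

/*
 * Hypothetical helper for use inside an SA_SIGINFO handler: returns
 * nonzero when the signal was generated by asynchronous I/O completion.
 */
int is_aio_completion(const siginfo_t *si)
{
	return si->si_code == SI_ASYNCIO;
}
```

A single helper thread, as suggested in the message, would rely on exactly this kind of si_code check to dispatch timer and AIO notifications arriving on the same signal.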
Re: IRQ problem with PCMCIA
On Tue, 2005-08-23 at 09:49 +0200, Erik Mouw wrote:

Is there any place where we can get your current patches?

Which ones - the PATA IDE ones are in 2.6.11-ac, a subset in Fedora (other changes in the core IDE code make forward porting stuff for hotplug really tricky past 2.6.11). The SATA ones I can certainly put up if there is interest. I don't want to put them somewhere too available yet because this right now is stuff you only want to use under controlled circumstances for development until both they and the core SATA layer have some improvements.

Alan
Re: mass tulip_stop_rxtx() failed, network stops
jerome lacoste wrote:
On 8/23/05, Tomasz Chmielewski [EMAIL PROTECTED] wrote:

(...) We are running four more machines like that, the only difference is the kernel they are running (2.6.11.4). On some of them, there are serious problems with the network, and they usually happen when the traffic is bigger than usual (i.e., some big software deployment to several workstations, remote backup, etc.). The syslog is then full of entries like these:

Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

I am seeing thousands of tulip_stop_rxtx() failed messages as well with 2.6.11. No regular network failure though. See http://kerneltrap.org/mailarchive/1/message/110291/flat

This may have something to do with this patch, introduced with 2.6.10 (see the ChangeLog-2.6.10). It would explain why I had no problems on ~20 machines with 2.6.8.1 kernel, and I have this issue on the machines with 2.6.11.5 kernel.

[PATCH] tulip: make tulip_stop_rxtx() wait for DMA to fully stop
From: John W. Linville [EMAIL PROTECTED]

tulip_stop_rxtx() doesn't wait for DMA to fully stop like the function call name implies. This was submitted through my employer -- I am not the original author of this patch. However, I passed it by Jeff Garzik and he expressed interest in having it upstream.

--
Tomek
http://wpkg.org
Re: [Alsa-devel] [2.6 patch] sound/core/memalloc.c: fix PROC_FS=n compilation
At Tue, 23 Aug 2005 03:24:25 +0200, Adrian Bunk wrote: On Mon, Aug 22, 2005 at 02:41:07PM +0200, Takashi Iwai wrote: ... I think the below is simpler. Looks good. OK, it's now on ALSA tree. Thanks. Takashi
Re: [PATCH] Add MCE resume under ia32
Hi!

It's widely seen that an MCE non-fatal error is reported after resume. It seems MCE resume is lacking under ia32. This patch tries to fix the gap.

Well, your patch seems like a missing piece of the puzzle, but:

a) we probably want to do it for x86-64, too, and

b)

diff -puN arch/i386/power/cpu.c~mcheck_resume arch/i386/power/cpu.c
--- linux-2.6.13-rc6/arch/i386/power/cpu.c~mcheck_resume 2005-08-23 09:32:13.054008584 +0800
+++ linux-2.6.13-rc6-root/arch/i386/power/cpu.c 2005-08-23 09:41:54.992540480 +0800
@@ -104,6 +104,8 @@ static void fix_processor_context(void)
 }

+extern void mcheck_init(struct cpuinfo_x86 *c);
+
 void __restore_processor_state(struct saved_context *ctxt)
 {
 	/*

this should go to some header file and most importantly

@@ -138,6 +140,9 @@ void __restore_processor_state(struct sa
 	fix_processor_context();
 	do_fpu_end();
 	mtrr_ap_init();
+#ifdef CONFIG_X86_MCE
+	mcheck_init(&boot_cpu_data);
+#endif
 }

c) can't we register MCEs like some kind of system device so that these kinds of hooks are not necessary?

Pavel

--
if you have sharp zaurus hardware you don't need... you know my address
Re: APIC version and 8-bit APIC IDs
On Mon, 22 Aug 2005, Martin Wilck wrote: It's a scalable system where multiple boards may be combined. Anyway, I see nothing in the specs that says you must start counting CPUs from zero. Well, Intel's Multiprocessor Specification mandates that (see section 3.6.1 and also the compliance list in Appendix C). It does not mandate local APIC IDs to be consecutive though. Maciej
Re: kernel module seg fault
On Tue, 23 Aug 2005, manomugdha biswas wrote: Hi, I have written a kernel module and I can load (insmod) it without any error. But when I run my module it gets a seg fault at interruptible_sleep_on_timeout(); I have used this function in the following way: DECLARE_WAIT_QUEUE_HEAD(wq); init_waitqueue_head(&wq); interruptible_sleep_on_timeout(&wq, 2); I am using redhat version 9.0 and kernel version 2.4.20-8. Could you please shed some light on this issue? Manomugdha Biswas seg fault?? You mean you get a kernel panic? Please show us what it says. Note you can't sleep with a spin-lock held. Cheers, Dick Johnson Penguin : Linux version 2.6.12.5 on an i686 machine (5537.79 BogoMips). Warning : 98.36% of all statistics are fiction. . I apologize for the following. I tried to kill it with the above dot : The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [EMAIL PROTECTED] - and destroy all copies of this information, including any attachments, without reading or disclosing them. Thank you.
patch for compiling ppc without pmu
Hi,

This patch seems to be required to compile 2.6.13-rc6 for ppc configured without PMU. Apologies if it is already known; a quick search didn't turn up anything like it.

Signed-Off-By: Johannes Berg [EMAIL PROTECTED]

--- linux-2.6.13-rc6.orig/arch/ppc/platforms/pmac_time.c 2005-08-23 12:14:37.689485664 +0200
+++ linux-2.6.13-rc6/arch/ppc/platforms/pmac_time.c 2005-08-23 12:14:37.689485664 +0200
@@ -251,7 +251,7 @@
 	struct device_node *cpu;
 	unsigned int freq, *fp;
 
-#ifdef CONFIG_PM && CONFIG_ADB_PMU
+#if defined(CONFIG_PM) && defined(CONFIG_ADB_PMU)
 	pmu_register_sleep_notifier(&time_sleep_notifier);
 #endif /* CONFIG_PM */
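For readers wondering why the directive has to change form: #ifdef accepts a single identifier, so a condition on two symbols needs #if with the defined() operator. The following is only a userspace sketch of that idiom; FEATURE_PM and FEATURE_ADB_PMU are made-up stand-ins for the kernel config symbols, and sleep_notifier_enabled() is a hypothetical name.

```c
/* Hypothetical stand-ins for CONFIG_PM and CONFIG_ADB_PMU. */
#define FEATURE_PM 1
/* FEATURE_ADB_PMU is deliberately left undefined here. */

/*
 * Returns 1 only when both feature macros are defined, 0 otherwise.
 * "#ifdef A && B" would not express this: #ifdef looks at one
 * identifier only, so the combined test is written with defined().
 */
int sleep_notifier_enabled(void)
{
#if defined(FEATURE_PM) && defined(FEATURE_ADB_PMU)
    return 1;
#else
    return 0;
#endif
}
```

With only FEATURE_PM defined, as above, the guarded branch is compiled out, which mirrors the "configured without PMU" case the patch is about.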
Re: skge missing ifdefs.
Hi, On Tue, 23 Aug 2005, Al Viro wrote: As for your s/thread_info/stack/ - I don't believe it's doable in mainline right now. It's definitely separate from m68k merge and should not be mixed into it. Moreover, mandatory changes to every platform arch-specific code over basically cosmetic issue (renaming a field of task_struct) at this point are going to be gratuitous PITA for every architecture with out-of-tree development. And m68k folks, of all people, should know what fun it is. No, I don't know it. Sometimes merging can be tricky, but then I check the original diff and apply it manually. What I'm planning involves no logical changes, so it would be an absolute no-brainer to merge. It's the logical changes that may even compile normally that can be a real PITA. When folks start using task_thread_info() in arch/* (i.e. by 2.6.1[45]) the size of that delta will go down in a big way and it will be less painful. Until then... Not a good idea. I already did the complete conversion (and I did it forward and backward to be sure the result is the same), so I don't see the problem with merging it in 2.6.13. The final removal of the thread_info field can happen in 2.6.14 and any missed changes in external trees are trivially fixable. bye, Roman
Re: sched_yield() makes OpenLDAP slow
On Mon, 22 Aug 2005, Robert Hancock wrote: linux-os (Dick Johnson) wrote: I reported that sched_yield() wasn't working (at least as expected) back in March of 2004. for(;;) sched_yield(); ... takes 100% CPU time as reported by `top`. It should take practically 0. Somebody said that this was because `top` was broken, others said that it was because I didn't know how to code. Nevertheless, the problem was not fixed, even after scheduler changes were made for the current version. This is what I would expect if run on an otherwise idle machine. sched_yield just puts you at the back of the line for runnable processes, it doesn't magically cause you to go to sleep somehow. When a kernel build is occurring??? Plus `top` itself It should damn well sleep while giving up the CPU. If it doesn't it's broken. -- Robert Hancock Saskatoon, SK, Canada To email, remove nospam from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/ Cheers, Dick Johnson Penguin : Linux version 2.6.12.5 on an i686 machine (5537.79 BogoMips). Warning : 98.36% of all statistics are fiction.
Re: [PATCH 1/2] New system call, unshare
On Mon, Aug 08, 2005 at 03:46:06PM +0100, Alan Cox wrote: On Llu, 2005-08-08 at 09:33 -0400, Janak Desai wrote: [PATCH 1/2] unshare system call: System Call handler function sys_unshare Given the complexity of the kernel code involved and the obscurity of the functionality why not just do another clone() in userspace to unshare the things you want to unshare and then _exit the parent ? Because you want to keep children? Because you don't want to deal with the implications for sessions/groups/etc.? FWIW, the syscall makes sense. It is a valid primitive and the only reason to keep it out of clone() (i.e. not making it just another flag to clone()) is that clone() is already cluttered _and_ uses bad calling conventions for that stuff (an "I want to retain" list rather than an "I want private" list).
Re: [PATCH 0/3] New system call, unshare
On Wed, Aug 10, 2005 at 04:08:31PM +0200, Florian Weimer wrote: * Janak Desai: With unshare, namespace setup can be done using PAM session management functions without patching individual commands. I don't think it's a good idea to use security-critical code well without its original specification. Clearly the current situation sucks, but this is mainly a lack of PAM functionality, IMHO. Eh? We are talking about a primitive that has far more uses than PAM. This is a missing piece of the stuff done by clone() and fork(): each task is a virtual machine with sharable components. We can get a copy of the machine with an arbitrary set of components replaced with private copies. That's what clone() and fork() do. The thing missing from that set is taking a component (VM, descriptors, etc.) of the process itself and making it private. The same thing we do on fork(), but without creating a new process. FWIW, I'm OK with that. IIRC, Linus ACKed the concept some time ago. PAM is one obvious use, but there are other situations where the lack of that primitive is inconvenient...
Re: [PATCH] fix send_sigqueue() vs thread exit race
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: Thomas Gleixner wrote:

Ok, exit_itimers()->itimer_delete() is called when the last thread exits or does exec. kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after that nobody can access this timer, so we don't need to lock timer->it_lock at all in this case. No lock - no deadlock.

It still deadlocks:

CPU 0                                   CPU 1
write_lock(&tasklist_lock);
__exit_signal()
                                        timer expires
                                        base->running_timer = timer
                                        send_group_sigqueue()
                                        read_lock(&tasklist_lock);
exit_itimers()
del_timer_sync(timer)
waits for ever because                  waits for ever on tasklist_lock
base->running_timer == timer

I still think the last patch I sent is still necessary.

But I know nothing about kernel/posix-cpu-timers.c, I doubt it will work for posix_cpu_timer_del(). I don't have time to study posix-cpu-timers now. However, I see that __exit_signal() calls posix_cpu_timers_exit_xxx(), so maybe it can work?

int posix_cpu_timer_del(struct k_itimer *timer)
{
	struct task_struct *p = timer->it.cpu.task;

	if (timer->it.cpu.firing)
		return TIMER_RETRY;

	if (unlikely(p == NULL))
		return 0;

	if (!list_empty(&timer->it.cpu.entry)) {
		read_lock(&tasklist_lock);

Surely, it should be impossible for this to happen when the process exits, otherwise it would deadlock immediately, we did write_lock(tasklist). Thomas, do you know something about posix-cpu-timers.c?

Not much. I'll look into this.

tglx
Re: [PATCH 2.6.13-rc6] cpu_exclusive sched domains on partial nodes temp fix
* Paul Jackson [EMAIL PROTECTED] wrote:

 	/*
+	 * Hack to avoid 2.6.13 partial node dynamic sched domain bug.
+	 * Require the 'cpu_exclusive' cpuset to include all (or none)
+	 * of the CPUs on each node, or return w/o changing sched domains.
+	 * Remove this hack when dynamic sched domains fixed.
+	 */
+	{
+		int i, j;
+
+		for_each_cpu_mask(i, cur->cpus_allowed) {
+			for_each_cpu_mask(j, node_to_cpumask(cpu_to_node(i))) {
+				if (!cpu_isset(j, cur->cpus_allowed))
+					return;
+			}
+		}
+	}
+

certainly looks acceptable from a scheduler POV.

Acked-by: Ingo Molnar [EMAIL PROTECTED]

Ingo
[PATCH 2.6.13-rc6] cpu_exclusive sched domains on partial nodes temp fix
If Dinakar, Hawkes and Nick concur (and no one else complains too loud) then the following should go into 2.6.13, to avoid the potential kernel oops that Hawkes reported in Dinakar's feature to allow user control of dynamic sched domain placement using cpu_exclusive cpusets.

This patch keeps the kernel/cpuset.c routine update_cpu_domains() from invoking the sched.c routine partition_sched_domains() if the cpuset in question doesn't fall on node boundaries.

I have boot tested this on an SN2, and with the help of a couple of ad hoc printk's, determined that it does indeed avoid calling the partition_sched_domains() routine on partial nodes. I did not directly verify that this avoids setting up bogus sched domains or avoids the oops that Hawkes saw.

Obviously, if the above named parties decide to take some other path, then this patch should be discarded.

I submit this patch under the expectation that Hawkes' and others' fixes to support sched domains not on node boundaries will go into *-mm and 2.6.14. Do not include the following patch in *-mm or 2.6.14 versions which have the real sched domain fixes.

This patch imposes a silent artificial constraint on which cpusets can be used to define dynamic sched domains.

This patch should allow proceeding with this new feature in 2.6.13 for the configurations in which it is useful (node aligned sched domains) while avoiding trying to set up sched domains in the less useful cases that can cause the kernel corruption and oops.

Signed-off-by: Paul Jackson [EMAIL PROTECTED]

Index: linux-2.6.13-cpuset-mempolicy-migrate/kernel/cpuset.c
===
--- linux-2.6.13-cpuset-mempolicy-migrate.orig/kernel/cpuset.c
+++ linux-2.6.13-cpuset-mempolicy-migrate/kernel/cpuset.c
@@ -636,6 +636,23 @@ static void update_cpu_domains(struct cp
 		return;
 
 	/*
+	 * Hack to avoid 2.6.13 partial node dynamic sched domain bug.
+	 * Require the 'cpu_exclusive' cpuset to include all (or none)
+	 * of the CPUs on each node, or return w/o changing sched domains.
+	 * Remove this hack when dynamic sched domains fixed.
+	 */
+	{
+		int i, j;
+
+		for_each_cpu_mask(i, cur->cpus_allowed) {
+			for_each_cpu_mask(j, node_to_cpumask(cpu_to_node(i))) {
+				if (!cpu_isset(j, cur->cpus_allowed))
+					return;
+			}
+		}
+	}
+
+	/*
 	 * Get all cpus from parent's cpus_allowed not part of exclusive
 	 * children
 	 */

-- 
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson [EMAIL PROTECTED] 1.650.933.1373
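The "all or none of the CPUs on each node" rule from this patch can be tried out in userspace with plain bitmasks. What follows is a sketch under assumptions, not the kernel code: CPU sets are modelled as 64-bit masks instead of cpumask_t, the node layout is passed in as an array, and cpus_node_aligned() is a hypothetical name. It walks nodes rather than CPUs, which should accept and reject the same sets as the nested loops in the hunk.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if `allowed` contains, for every node, either all of that
 * node's CPUs or none of them; returns 0 as soon as one node is only
 * partly covered (the case in which the patch returns early).
 */
int cpus_node_aligned(uint64_t allowed, const uint64_t *node_cpus,
                      size_t nr_nodes)
{
    for (size_t n = 0; n < nr_nodes; n++) {
        uint64_t overlap = allowed & node_cpus[n];

        /* Some, but not all, CPUs of this node are in the set. */
        if (overlap != 0 && overlap != node_cpus[n])
            return 0;
    }
    return 1;
}
```

For a two-node layout with CPUs 0-3 on one node and 4-7 on the other, the masks 0x0F and 0xFF pass the check, while 0x03 (half of the first node) does not.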
Re: [PATCH] fix send_sigqueue() vs thread exit race
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: But I know nothing about kernel/posix-cpu-timers.c, I doubt it will work for posix_cpu_timer_del(). I don't have time to study posix-cpu-timers now. However, I see that __exit_signal() calls posix_cpu_timers_exit_xxx(), so maybe it can work? timer->it.cpu.task is set to NULL by posix_cpu_timers_exit(), so the code in posix_cpu_timer_del() returns before accessing tasklist_lock. The exit functions do not take any locks, but it is not necessary there. posix_run_cpu_timers(p) is called with p=current() and we have interrupts disabled, so the timer interrupt cannot run on this CPU. The currently exiting process cannot run at the same time on a different CPU, so no race and no lockup is possible here. tglx
usb oops in 2.6.13-rc6-mm2
Hi, usbcore: deregistering driver usb-storage usb 1-1: USB disconnect, address 3 Unable to handle kernel NULL pointer dereference at RIP: 803cf140{_spin_lock+0} PGD 1c303067 PUD 1c304067 PMD 0 Oops: 0002 [1] SMP CPU 0 Modules linked in: nls_iso8859_1 nls_cp437 vfat fat nls_base ide_cd cdrom Pid: 80, comm: khubd Not tainted 2.6.13-rc6-mm2 RIP: 0010:[803cf140] 803cf140{_spin_lock+0} RSP: 0018:81001fc75d80 EFLAGS: 00010296 RAX: 81001c08cdb0 RBX: 810019f5f8f8 RCX: 81001c4b14e8 RDX: 0070 RSI: 8040cfcc RDI: RBP: 810019f5f8a0 R08: R09: R10: 0001 R11: 8018ad27 R12: R13: 810001a23c20 R14: 810001a23c00 R15: 0100 FS: 2ade8b00() GS:80612880() knlGS:61ad4bb0 CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: CR3: 1c302000 CR4: 06e0 Process khubd (pid: 80, threadinfo 81001fc74000, task 8100019f4e80) Stack: 803cd130 80500aa0 810019f5f980 80500aa0 802a3263 810019f5f9e8 810019f5f8a0 810019f5f8a0 802a34f2 80500880 Call Trace:803cd130{klist_remove+21} 802a3263{__device_release_driver+75} 802a34f2{device_release_driver+39} 802a2db7{bus_remove_device+146} 802a1f75{device_del+55} 802a1fbc{device_unregister+9} 802ff51c{hub_thread+900} 80145e70{autoremove_wake_function+0} 802ff198{hub_thread+0} 80145a70{keventd_create_kthread+0} 80145c9e{kthread+203} 8012e3ae{schedule_tail+57} 8010e6ce{child_rip+8} 80145a70{keventd_create_kthread+0} 80145bd3{kthread+0} 8010e6c6{child_rip+0} Code: f0 fe 0f 79 09 f3 90 80 3f 00 7e f9 eb f2 c3 f0 ff 0f 8b 07 RIP 803cf140{_spin_lock+0} RSP 81001fc75d80 CR2: Just got this oops when removing a USB device managed by usb-storage. usb-storage had been manually removed (as you can see from the kernel message), a few seconds later I removed power from the device and the oops happened right then. -- Jens Axboe
new qla2xxx driver breaks SAN setup with 2 controllers
hello, we are experiencing problems with the new qlogic driver in 2.6.12 on a set of servers with qla2310 HBAs. The problem is as follows: The Infotrend storage array we are using has two controllers, each of them has two virtual discs with a couple of partitions exported as shared storage. The controllers are linked inside of the storage box, each controller has one qlogic fabric switch attached, and half of the servers are connected to the lefthand switch, the other half is connected to the righthand switch. Now, with the qlogic driver in 2.6.11.12, we can access all shares on both controllers from every server, while the new driver allows only access to the respective controller where the switch is attached to directly, thus depriving the servers of half of it's shared storage devices. Example: on server s05, we have a boot device (lun 3 on primary controller), and 2 shared storages (lun 9 on primary, lun 10 on secondary controller). With 2.6.11.12, this looks as follows: s05:~# cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 03 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-AccessANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 00 Lun: 09 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-AccessANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 01 Lun: 10 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-AccessANSI SCSI revision: 03 and the driver sees everything: s05:~# cat /proc/scsi/qla2xxx/0 QLogic PCI to Fibre Channel Host Adapter for QLA2310: Firmware version 3.03.08 IPX, Driver version 8.00.02b4-k ISP: ISP2300, Serial# R74545 Request Queue = 0xcf94, Response Queue = 0xcf98 Request Queue count = 2048, Response Queue count = 512 Total number of active commands = 0 Total number of interrupts = 1117762 Device queue depth = 0x20 Number of free request entries = 964 Number of mailbox timeouts = 0 Number of ISP aborts = 0 Number of loop resyncs = 0 Number of retries for empty slots = 0 Number of reqs in pending_q= 0, 
retry_q= 0, done_q= 0, scsi_retry_q= 0 Host adapter:loop state = READY, flags = 0x1a03 Dpc flags = 0x0 MBX flags = 0x0 Link down Timeout = 030 Port down retry = 030 Login retry count = 030 Commands retried with dropped frame(s) = 0 Product ID = 4953 5020 2020 0001 SCSI Device Information: scsi-qla0-adapter-node=20e08b1bd113; scsi-qla0-adapter-port=21e08b1bd113; scsi-qla0-target-0=21d02382; scsi-qla0-target-1=21d02362; SCSI LUN Information: (Id:Lun) * - indicates lun is not registered with the OS. ( 0: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:81 00 ( 0: 3): Total reqs 470693, Pending reqs 0, flags 0x0, 0:0:81 00 ( 0: 9): Total reqs 227717, Pending reqs 0, flags 0x0, 0:0:81 00 ( 0:11): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00 ( 0:13): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00 ( 1: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:82 00 ( 1:10): Total reqs 12, Pending reqs 0, flags 0x0, 0:0:82 00 ( 1:12): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00 ( 1:14): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00 while on 2.6.12.5 and 2.6.13-rc6 it looks like this: sm05:~# scsiadd -a 0 0 0 9 Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 03 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-AccessANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 00 Lun: 09 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-AccessANSI SCSI revision: 03 sm05:~# scsiadd -a 0 0 1 10 Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 03 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-AccessANSI SCSI revision: 03 Host: scsi0 Channel: 00 Id: 00 Lun: 09 Vendor: IFT Model: A16F-R1211 Rev: 334B Type: Direct-AccessANSI SCSI revision: 03 unfortunately, the proc interface was removed: s05:/sys/devices/pci:00/:00:02.0/:01:00.0/:02:02.0/host0# find . . 
./rport-0:0-1 ./rport-0:0-1/power ./rport-0:0-1/power/state ./rport-0:0-0 ./rport-0:0-0/target0:0:0 ./rport-0:0-0/target0:0:0/0:0:0:9 ./rport-0:0-0/target0:0:0/0:0:0:9/ioerr_cnt ./rport-0:0-0/target0:0:0/0:0:0:9/iodone_cnt ./rport-0:0-0/target0:0:0/0:0:0:9/iorequest_cnt ./rport-0:0-0/target0:0:0/0:0:0:9/iocounterbits ./rport-0:0-0/target0:0:0/0:0:0:9/timeout ./rport-0:0-0/target0:0:0/0:0:0:9/state ./rport-0:0-0/target0:0:0/0:0:0:9/delete ./rport-0:0-0/target0:0:0/0:0:0:9/rescan ./rport-0:0-0/target0:0:0/0:0:0:9/rev ./rport-0:0-0/target0:0:0/0:0:0:9/model ./rport-0:0-0/target0:0:0/0:0:0:9/vendor ./rport-0:0-0/target0:0:0/0:0:0:9/scsi_level ./rport-0:0-0/target0:0:0/0:0:0:9/type ./rport-0:0-0/target0:0:0/0:0:0:9/queue_type ./rport-0:0-0/target0:0:0/0:0:0:9/queue_depth ./rport-0:0-0/target0:0:0/0:0:0:9/device_blocked
Re: 2.6.13-rc6-mm2
Hi, On 23/08/2005 4:30 p.m., Andrew Morton wrote: ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/ - Various updates. Nothing terribly noteworthy. Yup, seems to be generally good... Noticed this in the log earlier tonight: Aug 23 19:44:51 tornado kernel: hub 5-0:1.0: port 1 disabled by hub (EMI?), re-enabling... Aug 23 19:44:51 tornado kernel: usb 5-1: USB disconnect, address 2 Aug 23 19:44:51 tornado kernel: drivers/usb/class/usblp.c: usblp0: removed Aug 23 19:44:51 tornado kernel: Unable to handle kernel NULL pointer dereference at virtual address 0004 Aug 23 19:44:51 tornado kernel: printing eip: Aug 23 19:44:51 tornado kernel: c01ccef2 Aug 23 19:44:51 tornado kernel: *pde = Aug 23 19:44:51 tornado kernel: Oops: [#1] Aug 23 19:44:51 tornado kernel: SMP Aug 23 19:44:51 tornado kernel: last sysfs file: /devices/pci:00/:00:1f.3/i2c-0/name Aug 23 19:44:51 tornado kernel: Modules linked in: nfsd exportfs lockd eeprom sunrpc ipv6 iptable_filter binfmt_misc reiser4 zlib_de flate zlib_inflate dm_mod video thermal processor fan button ac tpm_nsc i2c_i801 sky2 e100 sr_mod Aug 23 19:44:51 tornado kernel: CPU:1 Aug 23 19:44:51 tornado kernel: EIP:0060:[c01ccef2]Not tainted VLI Aug 23 19:44:51 tornado kernel: EFLAGS: 00010286 (2.6.13-rc6-mm2) Aug 23 19:44:51 tornado kernel: EIP is at _raw_spin_lock+0x7/0x73 Aug 23 19:44:51 tornado kernel: eax: ebx: ecx: c1a60658 edx: c1a63e24 Aug 23 19:44:51 tornado kernel: esi: edi: c0382400 ebp: f7c55e98 esp: f7c55e90 Aug 23 19:44:51 tornado kernel: ds: 007b es: 007b ss: 0068 Aug 23 19:44:51 tornado kernel: Process khubd (pid: 109, threadinfo=f7c54000 task=c192b030) Aug 23 19:44:51 tornado kernel: Stack: f7c58a8c f7c55ea0 c0312219 f7c55eb0 c030feb7 f7c58ae8 f7c58a48 Aug 23 19:44:51 tornado kernel:f7c55ec4 c0217e73 f7c58a48 f7d134ec 0040 f7c55ed0 c0217ec0 f7c58a48 Aug 23 19:44:51 tornado kernel:f7c55edc c0217814 f7c58a48 f7c55eec c0216ad2 f7c58a48 f7c58a14 f7c55ef8 Aug 23 19:44:51 tornado kernel: Call 
Trace: Aug 23 19:44:51 tornado kernel: [c01039c3] show_stack+0x94/0xca Aug 23 19:44:51 tornado kernel: [c0103b6c] show_registers+0x15a/0x1ea Aug 23 19:44:51 tornado kernel: [c0103d8a] die+0x108/0x183 Aug 23 19:44:51 tornado kernel: [c031295a] do_page_fault+0x1ea/0x63d Aug 23 19:44:51 tornado kernel: [c0103693] error_code+0x4f/0x54 Aug 23 19:44:51 tornado kernel: [c0312219] _spin_lock+0x8/0xa Aug 23 19:44:51 tornado kernel: [c030feb7] klist_remove+0x10/0x2c Aug 23 19:44:51 tornado kernel: [c0217e73] __device_release_driver+0x41/0x65 Aug 23 19:44:51 tornado kernel: [c0217ec0] device_release_driver+0x29/0x39 Aug 23 19:44:51 tornado kernel: [c0217814] bus_remove_device+0x52/0x60 Aug 23 19:44:51 tornado kernel: [c0216ad2] device_del+0x2e/0x5d Aug 23 19:44:51 tornado kernel: [c0216b0c] device_unregister+0xb/0x15 Aug 23 19:44:51 tornado kernel: [c0275d67] usb_disconnect+0x115/0x15c Aug 23 19:44:51 tornado kernel: [c0276b85] hub_port_connect_change+0x54/0x399 Aug 23 19:44:51 tornado kernel: [c027713e] hub_events+0x274/0x3b2 Aug 23 19:44:51 tornado kernel: [c0277296] hub_thread+0x1a/0xdf Aug 23 19:44:51 tornado kernel: [c012fba7] kthread+0x99/0x9d Aug 23 19:44:51 tornado kernel: [c01010b5] kernel_thread_helper+0x5/0xb Aug 23 19:44:51 tornado kernel: Code: 00 00 00 8b 0d a8 62 36 c0 e9 61 ff ff ff f3 90 31 c0 86 07 84 c0 0f 8e 79 ff ff ff 83 c4 18 5 b 5e 5f 5d c3 55 89 e5 56 53 89 c3 81 78 04 ad 4e ad de 75 2d be 00 e0 ff ff 21 e6 8b 06 39 43 0c reuben - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: return ENOBUFS instead of ENOMEM in generic_file_buffered_write
On Tue, Aug 23, 2005 at 11:46:33AM +0300, Pekka J Enberg wrote: As noticed by Dmitry Torokhov, write() can not return ENOMEM: http://www.opengroup.org/onlinepubs/95399/functions/write.html Therefore fixup generic_file_buffered_write() in mm/filemap.c (pointed out by Nathan Scott). We had this discussion before, for EACCES then. We've always been returning more errnos than SUS mentions and Linus declared it's fine.
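From the userspace side, the practical consequence is that a caller should not assume write() fails only with the error codes a standard happens to list. Below is a minimal sketch of a caller written that way, assuming nothing beyond POSIX write(); write_all() is a hypothetical helper name, and treating a zero-byte write as ENOSPC is a choice made for this example, not something from the thread.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all `len` bytes of `buf` to `fd`.
 * Returns 0 on success, -1 on failure with errno describing the error.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;          /* ENOMEM, ENOBUFS or anything else */
        }
        if (n == 0) {
            errno = ENOSPC;     /* no progress: report instead of spinning */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The only errno singled out is EINTR, because it means "try again"; every other value is passed up unchanged, so the caller keeps working whether the kernel reports ENOMEM or ENOBUFS.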
Re: sched_yield() makes OpenLDAP slow
On Tuesday 23 August 2005 14:17, linux-os (Dick Johnson) wrote: On Mon, 22 Aug 2005, Robert Hancock wrote: linux-os (Dick Johnson) wrote:

I reported that sched_yield() wasn't working (at least as expected) back in March of 2004. for(;;) sched_yield(); ... takes 100% CPU time as reported by `top`. It should take practically 0. Somebody said that this was because `top` was broken, others said that it was because I didn't know how to code. Nevertheless, the problem was not fixed, even after scheduler changes were made for the current version.

This is what I would expect if run on an otherwise idle machine. sched_yield just puts you at the back of the line for runnable processes, it doesn't magically cause you to go to sleep somehow.

When a kernel build is occurring??? Plus `top` itself It should damn well sleep while giving up the CPU. If it doesn't it's broken.

top doesn't run all the time:

# strace -o top.strace -tt top
14:52:19.407958 write(1, 758 root 16 0 104 2..., 79) = 79
14:52:19.408318 write(1, 759 root 16 0 100 1..., 79) = 79
14:52:19.408659 write(1, 760 root 16 0 100 1..., 79) = 79
14:52:19.409001 write(1, 761 root 18 0 2604 39..., 74) = 74
14:52:19.409342 write(1, 763 daemon17 0 108 1..., 78) = 78
14:52:19.409672 write(1, 773 root 16 0 104 2..., 79) = 79
14:52:19.410010 write(1, 774 root 16 0 104 2..., 79) = 79
14:52:19.410362 write(1, 775 root 16 0 100 1..., 79) = 79
14:52:19.410692 write(1, 776 root 16 0 104 2..., 79) = 79
14:52:19.411136 write(1, 777 daemon17 0 108 1..., 86) = 86
14:52:19.411505 select(1, [0], NULL, NULL, {5, 0}) = 0 (Timeout)

hrrr. ps...

14:52:24.411744 time([1124797944]) = 1124797944
14:52:24.411883 lseek(4, 0, SEEK_SET) = 0
14:52:24.411957 read(4, 24822.01 18801.28\n, 1023) = 18
14:52:24.412082 access(/var/run/utmpx, F_OK) = -1 ENOENT (No such file or directory)
14:52:24.412224 open(/var/run/utmp, O_RDWR) = 8
14:52:24.412328 fcntl64(8, F_GETFD) = 0
14:52:24.412399 fcntl64(8, F_SETFD, FD_CLOEXEC) = 0
14:52:24.412467 _llseek(8, 0, [0], SEEK_SET) = 0
14:52:24.412556 alarm(0) = 0
14:52:24.412643 rt_sigaction(SIGALRM, {0x4015a57c, [], SA_RESTORER, 0x40094ae8}, {SIG_DFL}, 8) = 0
14:52:24.412747 alarm(1) = 0

However, kernel compile shouldn't. I suggest stracing a for(;;) sched_yield(); test proggy with -tt, with and without a kernel compile in parallel, and comparing the output...

Hmm... actually, knowing that you will argue to death instead...

# cat t.c
#include <sched.h>
int main() { for(;;) sched_yield(); return 0; }
# gcc t.c
# strace -tt ./a.out
...
15:03:41.211324 sched_yield() = 0
15:03:41.211673 sched_yield() = 0
15:03:41.212034 sched_yield() = 0
15:03:41.212400 sched_yield() = 0
15:03:41.212749 sched_yield() = 0
15:03:41.213126 sched_yield() = 0
15:03:41.213486 sched_yield() = 0
15:03:41.213835 sched_yield() = 0
15:03:41.214220 sched_yield() = 0
15:03:41.214577 sched_yield() = 0
15:03:41.214939 sched_yield() = 0

I start while true; do true; done on another console...

15:03:43.314645 sched_yield() = 0
15:03:43.847644 sched_yield() = 0
15:03:43.954635 sched_yield() = 0
15:03:44.063798 sched_yield() = 0
15:03:44.171596 sched_yield() = 0
15:03:44.282624 sched_yield() = 0
15:03:44.391632 sched_yield() = 0
15:03:44.498609 sched_yield() = 0
15:03:44.605584 sched_yield() = 0
15:03:44.712538 sched_yield() = 0
15:03:44.819557 sched_yield() = 0
15:03:44.928594 sched_yield() = 0
15:03:45.040603 sched_yield() = 0
15:03:45.148545 sched_yield() = 0
15:03:45.259311 sched_yield() = 0
15:03:45.368563 sched_yield() = 0
15:03:45.476482 sched_yield() = 0
15:03:45.583568 sched_yield() = 0
15:03:45.690491 sched_yield() = 0
15:03:45.797512 sched_yield() = 0
15:03:45.906534 sched_yield() = 0
15:03:46.013545 sched_yield() = 0
15:03:46.120505 sched_yield() = 0
Ctrl-C

# uname -a
Linux firebird 2.6.12-r4 #1 SMP Sun Jul 17 13:51:47 EEST 2005 i686 unknown unknown GNU/Linux

-- vda
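The same comparison can be made without strace by letting the test program report its own CPU usage. This is only a sketch of one way to do that: yield_loop() and yield_loop_cpu_seconds() are hypothetical names, and clock() is used as a rough measure of CPU time consumed by the process.

```c
#include <sched.h>
#include <time.h>

/* Calls sched_yield() `iterations` times; returns how many calls succeeded. */
long yield_loop(long iterations)
{
    long ok = 0;

    for (long i = 0; i < iterations; i++) {
        if (sched_yield() == 0)
            ok++;
    }
    return ok;
}

/* CPU seconds this process consumed while running yield_loop(). */
double yield_loop_cpu_seconds(long iterations)
{
    clock_t start = clock();

    yield_loop(iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling yield_loop_cpu_seconds() with the same iteration count on an idle machine and again next to a busy loop, while also timing the wall-clock duration, would be one way to see whether the yielding process is really giving the CPU away, as the two strace excerpts suggest.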
Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)
Hi,

On Mon, 22 Aug 2005, john stultz wrote:

The reason why we calculate the interval_length in the continuous timesource case is because we are not assuming anything about the frequency that the timekeeping_periodic_hook() is called.

The problem with your patch is that it doesn't allow making such assumptions. Anyway, it's rather simple, if you want to update the time asynchronously:

	cycle_offset = get_cycles() - last_update;
	while (cycle_offset >= update_cycles) {
		cycle_offset -= update_cycles;
		last_update += update_cycles;
		// at init: system_update = update_cycles * mult;
		system_time += system_update;
		xtime += [tick_nsec, time_adj];
	}
	error = system_time - (xtime.tv_nsec << shift);
	if (abs(error) > update_cycles/2) {
		mult_adj = (error +- update_cycles/2) / update_cycles;
		mult += mult_adj;
		system_update += mult_adj * update_cycles;
		system_time -= mult_adj * cycle_offset;
		error -= mult_adj * cycle_offset;
	}
	if (xtime.tv_nsec + (error >> shift) > NSEC_PER_SEC) {
		system_time -= NSEC_PER_SEC << shift;
		second_overflow();
	}

Since we usually don't have to adjust for the error all at once, it should be possible to precalculate some of it in adjtimex/second_overflow and turn mult_adj into a mult_adj_shift. I didn't really check the math here in detail, so there should be enough errors left :), but I hope it's enough to show the idea (especially how to do it without mult/divide).

There are now variations of this possible, the initial cycle_offset can be constant, this happens if it's regularly called from an interrupt (and it's sufficient for UP systems). We could also completely ignore the error, so that the core calculation of the above results in the familiar:

	xtime += [tick_nsec, time_adj];
	if (xtime.tv_nsec > NSEC_PER_SEC)
		second_overflow();

Another variation would be useful for ppc64 (or maybe any 64bit arch, but ppc64 has already the matching gettimeofday). In this case we don't use a timespec based xtime and don't scale it to ns, but use 64bit values instead scaled to seconds.
The last one may become a bit of a challenge to keep as much as possible code common without abusing the preprocessor too much. In any case some functions will differ completely anyway, especially gettimeofday will be optimized differently depending on the arch/clock requirements, OTOH introducing a common gettimeofday (that would even require a 64bit divide) would be a huge mistake. bye, Roman - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
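The accumulation loop in the pseudo-code above can be tried out in user space. What follows is only a minimal sketch, assuming a fixed-point clock in which system_time is kept in nanoseconds shifted left by shift. The struct tk_sketch, the functions tk_init(), tk_accumulate() and tk_error(), and the cycle count passed in by the caller are hypothetical names for illustration. The time_adj term and the mult adjustment step are deliberately left out, since the message itself says that math was not checked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for illustration; field names follow the pseudo-code. */
struct tk_sketch {
	uint64_t last_update;   /* cycle count up to which time was accumulated */
	uint64_t update_cycles; /* length of one update interval, in cycles */
	uint32_t mult;          /* cycles -> (ns << shift) multiplier */
	uint32_t shift;         /* fixed-point shift */
	int64_t  system_update; /* update_cycles * mult, computed at init */
	int64_t  system_time;   /* clock time in (ns << shift) units */
	int64_t  xtime_nsec;    /* reference time in ns */
	int64_t  tick_nsec;     /* ns added to the reference per interval */
};

static void tk_init(struct tk_sketch *tk, uint64_t update_cycles,
		    uint32_t mult, uint32_t shift, int64_t tick_nsec)
{
	assert(update_cycles > 0); /* a zero interval would never terminate */
	tk->last_update = 0;
	tk->update_cycles = update_cycles;
	tk->mult = mult;
	tk->shift = shift;
	tk->system_update = (int64_t)update_cycles * mult;
	tk->system_time = 0;
	tk->xtime_nsec = 0;
	tk->tick_nsec = tick_nsec;
}

/* Consume every whole interval since last_update; returns how many. */
static unsigned tk_accumulate(struct tk_sketch *tk, uint64_t now_cycles)
{
	uint64_t cycle_offset = now_cycles - tk->last_update;
	unsigned intervals = 0;

	while (cycle_offset >= tk->update_cycles) {
		cycle_offset -= tk->update_cycles;
		tk->last_update += tk->update_cycles;
		tk->system_time += tk->system_update;
		tk->xtime_nsec += tk->tick_nsec;
		intervals++;
	}
	return intervals;
}

/* Drift of the clock against the reference, in (ns << shift) units. */
static int64_t tk_error(const struct tk_sketch *tk)
{
	return tk->system_time - tk->xtime_nsec * ((int64_t)1 << tk->shift);
}
```

With update_cycles = 1000, mult = 257, shift = 8 and tick_nsec = 1000, a call at cycle 3500 consumes three intervals and leaves the clock 3000 shifted units (roughly 11.7 ns) ahead of the reference. That is the kind of drift the mult adjustment in the original pseudo-code is meant to remove.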
RE: kernel module seg fault
Hi Biswas,

You need to post the complete kernel dump message and body of your
source code.

-Bunnan

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of manomugdha biswas
Sent: Tuesday, August 23, 2005 3:13 PM
To: linux-kernel@vger.kernel.org
Subject: kernel module seg fault

Hi,

I have written a kernel module and I can load (insmod) it without any
error. But when I run my module it gets a seg fault at
interruptible_sleep_on_timeout(). I have used this function in the
following way:

	DECLARE_WAIT_QUEUE_HEAD(wq);
	init_waitqueue_head(&wq);
	interruptible_sleep_on_timeout(&wq, 2);

I am using Red Hat version 9.0 and kernel version 2.4.20-8. Could you
please shed some light on this issue?

Manomugdha Biswas
[PATCH] blk queue io tracing support
Hi,

This is a little something I have played with. It allows you to see
exactly what is going on in the block layer for a given queue. Currently
it can log request queueing and building, dispatches, requeues, and
completions. I've uploaded a little silly app to do dumps here:

http://www.kernel.org/pub/linux/kernel/people/axboe/tools/blktrace.c

Sample output looks like this:

wiggum:~ # ./blktrace /dev/sda
relay name: /relay/sda0
0 3765 Q R 192-200
5 3765 G R
13 3765 M R [200-208]
15 3765 M R [208-216]
17 3765 M R [216-224]
18 3765 M R [224-232]
19 3765 M R [232-240]
20 3765 M R [240-248]
21 3765 M R [248-256]
154 3765 M R [256-264]
156 3765 M R [264-272]
157 3765 M R [272-280]
159 3765 M R [280-288]
160 3765 M R [288-296]
161 3765 M R [296-304]
162 3765 M R [304-312]
163 3765 M R [312-320]
164 3765 M R [320-328]
170 3765 M R [328-336]
171 3765 M R [336-344]
172 3765 M R [344-352]
173 3765 M R [352-360]
174 3765 M R [360-368]
175 3765 M R [368-376]
177 3765 M R [376-384]
178 3765 M R [384-392]
179 3765 Q R 392-400
180 3765 G R
181 3765 M R [400-408]
182 3765 M R [408-416]
183 3765 M R [416-424]
184 3765 M R [424-432]
185 3765 M R [432-440]
186 3765 M R [440-448]
187 3765 M R [448-456]
189 3765 M R [456-464]
190 3765 M R [464-472]
191 3765 M R [472-480]
193 3765 M R [480-488]
194 3765 M R [488-496]
196 3765 M R [496-504]
197 3765 M R [504-512]
228 3765 D R 192-392
245 3765 D R 392-512
14049 0 C R 192-392 [0]
14067 0 D R 392-512
14807 0 C R 392-512 [0]
Reads:
	Queued:    2, 160KiB
	Completed: 2, 160KiB
	Merges:    38
Writes:
	Queued:    0, 0KiB
	Completed: 0, 0KiB
	Merges:    0
Events: 47
Missed events: 0

This is a log of a dd if=/dev/sda of=/dev/null bs=64k count=2 and it
shows queueing (Q) and allocation (G) of two requests, along with the
merges (M) that happen there. Finally you see dispatch (D) and
completion (C) of them as well. When sigint is received, blktrace dumps
stats of the current run.
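The per-action counts in the summary can be checked against the event lines by hand or with a few lines of C. The sketch below works on the printed text only, not on the binary records that blktrace.c reads from relayfs. It assumes each event line starts with a number, a pid, an action letter and R or W, as in the sample output. The names trace_tally and tally_line() are hypothetical.

```c
#include <stdio.h>

/* Hypothetical counters, one per action letter in the sample output. */
struct trace_tally {
	unsigned queued;     /* Q */
	unsigned allocated;  /* G */
	unsigned merged;     /* M */
	unsigned dispatched; /* D */
	unsigned completed;  /* C */
};

/* Returns 0 if the line was counted, -1 if it is not an event line. */
static int tally_line(struct trace_tally *t, const char *line)
{
	unsigned long stamp; /* first column of the printed line */
	int pid;
	char action, rw;

	if (sscanf(line, "%lu %d %c %c", &stamp, &pid, &action, &rw) != 4)
		return -1;
	if (rw != 'R' && rw != 'W')
		return -1;

	switch (action) {
	case 'Q': t->queued++; break;
	case 'G': t->allocated++; break;
	case 'M': t->merged++; break;
	case 'D': t->dispatched++; break;
	case 'C': t->completed++; break;
	default:
		return -1;
	}
	return 0;
}
```

Feeding every line of the sample run through tally_line() should give 2 queued, 2 allocated, 38 merged, 3 dispatched and 2 completed, which adds up to the 47 events reported. Header and summary lines are rejected with -1.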
It will work for scsi commands as well, so you can see what is going on when cdrecord is talking to the device (the cdb is dumped, not the data). The final integer printed in [] after a completion is the error, 0 for correct completion. You can register interest in various events, see blktrace.c (grep for buts and BLKSTARTTRACE). Patch is against 2.6.13-rc6-mm2. I'm attaching a relayfs update from Tom Zanussi as well, which is required to handle sub-buffer wrapping correctly. You need to apply both patches to play with this - and make sure to enable CONFIG_BLK_DEV_IO_TRACE in your .config, of course. And blktrace.c relies on relayfs being mounted on /relay, add something ala none /relay relayfsdefaults 0 0 to your /etc/fstab to accomplish that (or do it manually, only mentioning it for completeness). -- Jens Axboe diff -urpN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.6.13-rc6-mm2/drivers/block/blktrace.c linux-2.6.13-rc6-mm2/drivers/block/blktrace.c --- /opt/kernel/linux-2.6.13-rc6-mm2/drivers/block/blktrace.c 1970-01-01 01:00:00.0 +0100 +++ linux-2.6.13-rc6-mm2/drivers/block/blktrace.c 2005-08-23 13:34:17.0 +0200 @@ -0,0 +1,119 @@ +#include linux/config.h +#include linux/kernel.h +#include linux/blkdev.h +#include linux/blktrace.h +#include asm/uaccess.h + +void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes, +int rw, u32 what, int error, int pdu_len, char *pdu_data) +{ + struct blk_io_trace t; + unsigned long flags; + + if (rw == WRITE) + what |= BLK_TC_ACT(BLK_TC_WRITE); + else + what |= BLK_TC_ACT(BLK_TC_READ); + + if (((bt-act_mask BLK_TC_SHIFT) what) == 0) + return; + + t.magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION; + t.sequence = atomic_add_return(1, bt-sequence); + t.time = sched_clock() / 1000; + t.sector= sector; + t.bytes = bytes; + t.action= what; + t.pid = current-pid; + t.error = error; + t.pdu_len = pdu_len; + + local_irq_save(flags); + __relay_write(bt-rchan, t, sizeof(t)); + if (pdu_len) + __relay_write(bt-rchan, 
pdu_data, pdu_len); + local_irq_restore(flags); +} + +int blk_stop_trace(struct block_device *bdev) +{ + request_queue_t *q =
Re: 2.6.13-rc6-rt9
* Steven Rostedt [EMAIL PROTECTED] wrote:

> Ingo, can't you get rt.c to be more confusing. I mean it is too simple.
> We need to add a few more underscores here and there :-)
>
> Seriously, that rt.c is mind boggling. It was nice before, now it is
> just screaming for a cleanup (come now, do we really need the four
> underscores?). Same with latency.c.

i agree that it's ugly, but some of that ugliness is to achieve the
7-instructions fail-through codepath for the common acquire (and
release) codepath:

c03a5320 <__down_mutex>:
c03a5320:  89 c1                 mov     %eax,%ecx
c03a5322:  8b 15 08 76 3a c0     mov     0xc03a7608,%edx
c03a5328:  31 c0                 xor     %eax,%eax
c03a532a:  0f b1 51 14           cmpxchg %edx,0x14(%ecx)
c03a532e:  85 c0                 test    %eax,%eax
c03a5330:  75 01                 jne     c03a5333 <__down_mutex+0x13>
c03a5332:  c3                    ret

that's how much it takes to acquire an RT lock, and i worked hard to get
there. As long as the fastpath is kept this tight, feel free to do
cleanups. But i really want to avoid having to write mutex_down/up in
assembly for 24 architectures ...

	Ingo
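The shape of that fastpath (load the current task, compare the lock word against zero, swap the owner in, fall through on success) can be written portably with C11 atomics. This is a user-space sketch only, not the rt.c code. The struct sketch_mutex, the integer owner id and the stub slow path are assumptions made for illustration. Unlike the UP disassembly above, atomic_compare_exchange_strong() is atomic across CPUs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 means unlocked, otherwise the owner's id. */
struct sketch_mutex {
	atomic_int owner;
};

/* Placeholder slow path; a real lock would block or boost priority here. */
static bool sketch_lock_slowpath(struct sketch_mutex *m, int self)
{
	(void)m;
	(void)self;
	return false;
}

/* Fast path: a single compare-and-swap of 0 -> self, as in the cmpxchg. */
static bool sketch_lock(struct sketch_mutex *m, int self)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&m->owner, &expected, self))
		return true;
	return sketch_lock_slowpath(m, self);
}

/* Release succeeds only if the caller is the current owner. */
static bool sketch_unlock(struct sketch_mutex *m, int self)
{
	int expected = self;

	return atomic_compare_exchange_strong(&m->owner, &expected, 0);
}
```

Keeping the contended case in a separate function is what lets the compiler emit a short straight-line sequence for the uncontended acquire, which is the property the message asks cleanups to preserve.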
Re: 2.6.13-rc6-rt9
On Tue, 2005-08-23 at 14:36 +0200, Ingo Molnar wrote:
> * Steven Rostedt [EMAIL PROTECTED] wrote:
>
> > Ingo, can't you get rt.c to be more confusing. I mean it is too simple.
> > We need to add a few more underscores here and there :-)
> >
> > Seriously, that rt.c is mind boggling. It was nice before, now it is
> > just screaming for a cleanup (come now, do we really need the four
> > underscores?). Same with latency.c.
>
> i agree that it's ugly, but some of that ugliness is to achieve the
> 7-instructions fail-through codepath for the common acquire (and
> release) codepath:
>
> c03a5320 <__down_mutex>:
> c03a5320:  89 c1                 mov     %eax,%ecx
> c03a5322:  8b 15 08 76 3a c0     mov     0xc03a7608,%edx
> c03a5328:  31 c0                 xor     %eax,%eax
> c03a532a:  0f b1 51 14           cmpxchg %edx,0x14(%ecx)
> c03a532e:  85 c0                 test    %eax,%eax
> c03a5330:  75 01                 jne     c03a5333 <__down_mutex+0x13>
> c03a5332:  c3                    ret

Impressive!

> that's how much it takes to acquire an RT lock, and i worked hard to get
> there. As long as the fastpath is kept this tight, feel free to do
> cleanups. But i really want to avoid having to write mutex_down/up in
> assembly for 24 architectures ...

Warning! I'm hacking hard to get rid of the global pi_lock, and I'm not
worrying now about efficiency. I figure that if I can get it to work,
then we can speed it up afterwards. Since it's complex enough keeping
all the locks straight, I just want it to work without deadlocking. Once
I get it to work, I'll let you figure out how get it back down to
7-instructions :-)

-- Steve
Re: [patch] suspend: update warnings
Hi!

> > + * If you have unsupported (*) devices using DMA, you may have some
> > + * problems. If your disk driver does not support suspend... (IDE does),
> > + * it may cause some problems, too. If you change kernel command line
> > + * between suspend and resume, it may do something wrong. If you change
> > + * your hardware while system is suspended... well, it was not good idea;
> > + * but it will probably only crash.
>
> The most common driver issues I see involve:
>
> - USB being built in or as modules that are still loaded while
>   suspending (getting better, but not there yet)
> - DRI being used in X where the drivers don't properly support
>   suspend/resume (NVidia esp)
> - Firewire
> - CPU Freq (improving too)
>
> It might be good to mention these areas too.

Well, right; but those 'only' cause system to crash during suspend. I
was talking about really dangerous stuff. Both usb and cpufreq seems to
work okay here. I've added FAQ entry at the end:

Q: What information is useful for debugging suspend-to-disk problems?

A: Well, last messages on the screen are always useful. If something is
broken, it is usually some kernel driver, therefore trying with as
little as possible modules loaded helps a lot. I also prefer people to
suspend from console, preferably without X running. Booting with
init=/bin/bash, then swapon and starting suspend sequence manually
usually does the trick. Then it is good idea to try with latest vanilla
kernel.

Known problematic modules are; be sure to unload them before suspend:

- DRI being used in X where the drivers don't properly support
  suspend/resume (NVidia esp)
- Firewire
- SCSI

> Perhaps the 'changing your hardware' could mention that replacing
> faulty hardware may be safe.

I do not want to encourage people to do that. Yep, it's probably safe,
no, I do not want them to know.

	Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address
Re: 2.6.13-rc6-mm2 (hangs on non-SMP x86-64 and oopses)
On Tuesday, 23 of August 2005 06:30, Andrew Morton wrote: ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/ - Various updates. Nothing terribly noteworthy. It hangs solig during boot (after starting kjournald) on Asus L5D (non-SMP x86-64), which is caused by this patch: 8250-serial-console-locking-bug-spelling-fix.patch (from binary search). If this patch is reverted, it oopses like in the following trace. At the same time it works fine on an SMP box (dual-core Athlon 64). Greetings, Rafael ACPI: PCI Interrupt Link [LUS2] enabled at IRQ 5 PCI: setting IRQ 5 as level-triggered ACPI: PCI Interrupt :00:02.2[C] - Link [LUS2] - GSI 5 (level, low) - IRQ 5 PCI: Setting latency timer of device :00:02.2 to 64 ehci_hcd :00:02.2: EHCI Host Controller ehci_hcd :00:02.2: debug port 1 ehci_hcd :00:02.2: new USB bus registered, assigned bus number 3 ehci_hcd :00:02.2: irq 5, io mem 0xfebfdc00 PCI: cache line size of 64 is not supported by device :00:02.2 ehci_hcd :00:02.2: park 0 ehci_hcd :00:02.2: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004 hub 3-0:1.0: USB hub found usb 2-2: string descriptor 0 read error: -110 hub 3-0:1.0: 6 ports detected usb 2-2: string descriptor 0 read error: -110 usb 2-2: can't set config #1, error -110 Unable to handle kernel NULL pointer dereference at 0004 RIP: 8024373b{_raw_spin_lock+27} PGD 2ca73067 PUD 2ca46067 PMD 0 Oops: [1] PREEMPT CPU 0 Modules linked in: ehci_hcd ohci_hcd sk98lin evdev joydev sg st sr_mod sd_mod scsi_mod ide_cd cdrom dm_mod parport_pc lp parport Pid: 108, comm: khubd Not tainted 2.6.13-rc6-mm2 RIP: 0010:[8024373b] 8024373b{_raw_spin_lock+27} RSP: :81002fc7dcc8 EFLAGS: 00010282 RAX: 810001ce20d0 RBX: RCX: 81002d586530 RDX: RSI: 81002d586540 RDI: RBP: 81002fc7dce8 R08: R09: 81002d586410 R10: R11: R12: R13: 803f06a0 R14: 81002d5557f8 R15: 0002 FS: 2b28fe80() GS:804f8840() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 0004 CR3: 2ca61000 CR4: 06e0 Process khubd (pid: 108, 
threadinfo 81002fc7c000, task 810001ce20d0) Stack: 803f06a0 81002d5557f8 81002fc7dd08 8035612e 81002d555918 81002d555870 81002fc7dd28 80353b2f Call Trace:8035612e{_spin_lock+30} 80353b2f{klist_remove+31} 802ad11d{__device_release_driver+93} 802ad254{device_release_driver+52} 802ac994{bus_remove_device+180} 802ab7f8{device_del+56} 802d657f{usb_new_device+495} 802d7419{hub_thread+1961} 80354b6f{thread_return+187} 8014a710{autoremove_wake_function+0} 8014a710{autoremove_wake_function+0} 802d6c70{hub_thread+0} 8014a583{kthread+211} 8010f5e6{child_rip+8} 8014a4b0{kthread+0} 8010f5de{child_rip+0} BUG: spinlock trylock failure on UP on CPU#0, khubd/108 lock: 803bf020, .magic: dead4ead, .owner: khubd/108, .owner_cpu: 0 Call Trace:802439f9{add_preempt_count+105} 80243623{spin_bug+211} 8011004b{show_trace+571} 8024370e{_raw_spin_trylock+62} 80355e4e{_spin_trylock+30} 8010fc81{oops_begin+17} 8035702a{do_page_fault+1722} 8013452e{vprintk+830} 8013452e{vprintk+830} 80152296{kallsyms_lookup+246} 8010f431{error_exit+0} 8011004b{show_trace+571} 80110047{show_trace+567} 80110168{show_stack+216} 80110207{show_registers+135} 8011050e{__die+142} 80357098{do_page_fault+1832} 80355fa4{_spin_unlock_irq+20} 80354b6f{thread_return+187} 8010f431{error_exit+0} 8024373b{_raw_spin_lock+27} 802439f9{add_preempt_count+105} 8035612e{_spin_lock+30} 80353b2f{klist_remove+31} 802ad11d{__device_release_driver+93} 802ad254{device_release_driver+52} 802ac994{bus_remove_device+180} 802ab7f8{device_del+56} 802d657f{usb_new_device+495} 802d7419{hub_thread+1961} 80354b6f{thread_return+187} 8014a710{autoremove_wake_function+0} 8014a710{autoremove_wake_function+0} 802d6c70{hub_thread+0} 8014a583{kthread+211} 8010f5e6{child_rip+8} 8014a4b0{kthread+0} 8010f5de{child_rip+0} --- | preempt count: 0003 ] | 3 level deep critical section nesting: .. [80356126]
Re: [patch] suspend: update warnings
Hi.

On Tue, 2005-08-23 at 22:50, Pavel Machek wrote:
> Hi!
>
> > > + * If you have unsupported (*) devices using DMA, you may have some
> > > + * problems. If your disk driver does not support suspend... (IDE does),
> > > + * it may cause some problems, too. If you change kernel command line
> > > + * between suspend and resume, it may do something wrong. If you change
> > > + * your hardware while system is suspended... well, it was not good idea;
> > > + * but it will probably only crash.
> >
> > The most common driver issues I see involve:
> >
> > - USB being built in or as modules that are still loaded while
> >   suspending (getting better, but not there yet)
> > - DRI being used in X where the drivers don't properly support
> >   suspend/resume (NVidia esp)
> > - Firewire
> > - CPU Freq (improving too)
> >
> > It might be good to mention these areas too.
>
> Well, right; but those 'only' cause system to crash during suspend. I
> was talking about really dangerous stuff. Both usb and cpufreq seems to
> work okay here.

It depends on what you're using. I believe one of the usb root hub
drivers is okay, the others aren't. Similar for cpufreq. USB certainly
accounts for a high percentage of the failures I see.

> I've added FAQ entry at the end:
>
> Q: What information is useful for debugging suspend-to-disk problems?
>
> A: Well, last messages on the screen are always useful. If something is
> broken, it is usually some kernel driver, therefore trying with as
> little as possible modules loaded helps a lot. I also prefer people to
> suspend from console, preferably without X running. Booting with
> init=/bin/bash, then swapon and starting suspend sequence manually
> usually does the trick. Then it is good idea to try with latest vanilla
> kernel.
>
> Known problematic modules are; be sure to unload them before suspend:
>
> - DRI being used in X where the drivers don't properly support
>   suspend/resume (NVidia esp)
> - Firewire
> - SCSI
>
> > Perhaps the 'changing your hardware' could mention that replacing
> > faulty hardware may be safe.
>
> I do not want to encourage people to do that. Yep, it's probably safe,
> no, I do not want them to know.

:>

Thanks

Nigel
-- 
Evolution.
Enumerate the requirements.
Consider the interdependencies.
Calculate the probabilities.
Re: 2.6.13-rc6-rt9
* Steven Rostedt [EMAIL PROTECTED] wrote:

> On Tue, 2005-08-23 at 14:36 +0200, Ingo Molnar wrote:
> > * Steven Rostedt [EMAIL PROTECTED] wrote:
> >
> > > Ingo, can't you get rt.c to be more confusing. I mean it is too simple.
> > > We need to add a few more underscores here and there :-)
> > >
> > > Seriously, that rt.c is mind boggling. It was nice before, now it is
> > > just screaming for a cleanup (come now, do we really need the four
> > > underscores?). Same with latency.c.
> >
> > i agree that it's ugly, but some of that ugliness is to achieve the
> > 7-instructions fail-through codepath for the common acquire (and
> > release) codepath:
> >
> > c03a5320 <__down_mutex>:
> > c03a5320:  89 c1                 mov     %eax,%ecx
> > c03a5322:  8b 15 08 76 3a c0     mov     0xc03a7608,%edx
> > c03a5328:  31 c0                 xor     %eax,%eax
> > c03a532a:  0f b1 51 14           cmpxchg %edx,0x14(%ecx)
> > c03a532e:  85 c0                 test    %eax,%eax
> > c03a5330:  75 01                 jne     c03a5333 <__down_mutex+0x13>
> > c03a5332:  c3                    ret
>
> Impressive!
>
> > that's how much it takes to acquire an RT lock, and i worked hard to get
> > there. As long as the fastpath is kept this tight, feel free to do
> > cleanups. But i really want to avoid having to write mutex_down/up in
> > assembly for 24 architectures ...
>
> Warning! I'm hacking hard to get rid of the global pi_lock, and I'm not
> worrying now about efficiency. I figure that if I can get it to work,
> then we can speed it up afterwards. Since it's complex enough keeping
> all the locks straight, I just want it to work without deadlocking. Once
> I get it to work, I'll let you figure out how get it back down to
> 7-instructions :-)

yeah. It can always be done after the fact - the basics won't change.
(Note that the above disassembly is for UP, on SMP the fastpath is
longer and around 10-15 instructions.)

	Ingo
Re: [patch] suspend: update warnings
Hi!

> > > > + * If you have unsupported (*) devices using DMA, you may have some
> > > > + * problems. If your disk driver does not support suspend... (IDE does),
> > > > + * it may cause some problems, too. If you change kernel command line
> > > > + * between suspend and resume, it may do something wrong. If you change
> > > > + * your hardware while system is suspended... well, it was not good idea;
> > > > + * but it will probably only crash.
> > >
> > > The most common driver issues I see involve:
> > >
> > > - USB being built in or as modules that are still loaded while
> > >   suspending (getting better, but not there yet)
> > > - DRI being used in X where the drivers don't properly support
> > >   suspend/resume (NVidia esp)
> > > - Firewire
> > > - CPU Freq (improving too)
> > >
> > > It might be good to mention these areas too.
> >
> > Well, right; but those 'only' cause system to crash during suspend. I
> > was talking about really dangerous stuff. Both usb and cpufreq seems to
> > work okay here.
>
> It depends on what you're using. I believe one of the usb root hub
> drivers is okay, the others aren't. Similar for cpufreq. USB certainly
> accounts for a high percentage of the failures I see.

Do you remember which one it is? I have UHCI here, and it seems to work
okay. powernow-k8 and cpufreq-centrino also seem to behave ok.

	Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address
Re: [patch] suspend: update warnings
On Tue, Aug 23, 2005 at 02:50:17PM +0200, Pavel Machek wrote:
> - DRI being used in X where the drivers don't properly support
>   suspend/resume (NVidia esp)

NVidia's driver is not supported and is a copyright violation of the
copyrights of many of us. It's never supported so please don't mention
it.
Re: [patch] suspend: update warnings
Hi!

> > - DRI being used in X where the drivers don't properly support
> >   suspend/resume (NVidia esp)
>
> NVidia's driver is not supported and is a copyright violation of the
> copyrights of many of us. It's never supported so please don't mention
> it.

Unfortunately, it is quite common out there. I need to somehow keep
those bug reports off my mailbox. Okay, this should be enough:

Q: What information is useful for debugging suspend-to-disk problems?

A: Well, last messages on the screen are always useful. If something is
broken, it is usually some kernel driver, therefore trying with as
little as possible modules loaded helps a lot. I also prefer people to
suspend from console, preferably without X running. Booting with
init=/bin/bash, then swapon and starting suspend sequence manually
usually does the trick. Then it is good idea to try with latest vanilla
kernel.

Known problematic modules are; be sure to unload them before suspend:

- DRI being used (3D acceleration)
- Firewire
- SCSI

-- 
if you have sharp zaurus hardware you don't need... you know my address
Re: [patch] suspend: update warnings
On Tue, Aug 23, 2005 at 03:00:50PM +0200, Pavel Machek wrote:
> Hi!
>
> > > - DRI being used in X where the drivers don't properly support
> > >   suspend/resume (NVidia esp)
> >
> > NVidia's driver is not supported and is a copyright violation of the
> > copyrights of many of us. It's never supported so please don't mention
> > it.
>
> Unfortunately, it is quite common out there. I need to somehow keep
> those bug reports off my mailbox.

I think we made it pretty clear that people with binary modules should
sod off. Feel free to use banner for a big sod off as usual warning for
all binary module user idiots.
Ext3 Errors on Dell RAID
Problem:
  I get massive ext3 errors once every few days. See the "errors on
  console" section below. Almost all commands return I/O error. I have
  to power cycle the machine to get it running again. Upon reboot, there
  are usually 3 orphan inodes deleted and everything is fine. See
  "messages on reboot" below.

Configuration:
  System:     Dell PowerEdge 6300/500, 4 CPU SMP w/2GB memory
  Discs:      3 SCSI discs in a controller-managed striped configuration
  Controller: Dell PERC-2 (kernel messages in "kernel boot messages"
              below)

Other:
  I had this problem before. I upgraded the card firmware to 2.8/build
  6809, but still the same issue. I tried with the 2.4.29 kernel
  (aacraid driver v 1.1-3) from the Slackware (10?) distribution and
  then I upgraded to 2.4.31. It has the same driver version and same
  problem. Running fsck always shows everything is fine (rc=0).

Does anybody have experience with this machine working well? If so, what
combination of kernel and firmware version? Or does anybody know the
root cause of the occasional massive ext3 errors or what I can do to
test and/or fix it?

Please cc me jbalint-at-gmail as I am not on the list. Thanks.

Jess

-- -- errors on console -- --
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_orphan_add: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read inode block - inode=1015869, block=1015811
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read inode block - inode=1015869, block=1015811
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_orphan_add: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read inode block - inode=1213811, block=1212461
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_new_inode: IO failure

-- -- messages on reboot -- --
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: sd(8,2): orphan cleanup on readonly fs
EXT3-fs: sd(8,2): 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.

-- -- kernel boot messages -- --
SCSI subsystem driver Revision: 1.00
Red Hat/Adaptec aacraid driver (1.1-3 Aug 16 2005 17:25:05)
AAC0: kernel 2.8.4 build 6089
AAC0: monitor 2.8.4 build 6089
AAC0: bios 2.8.0 build 6089
AAC0: serial 4c72e2fafaf001
scsi0 : percraid
  Vendor: DELL  Model: rootvg  Rev: V1.0
  Type: Direct-Access  ANSI SCSI revision: 02
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
  Adaptec aic7890/91 Ultra2 SCSI adapter
  aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi2 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
  Adaptec aic7890/91 Ultra2 SCSI adapter
  aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi3 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
  Adaptec aic7860 Ultra SCSI adapter
  aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs
blk: queue f7aaca18, I/O limit 4095Mb (mask 0x)
(scsi3:A:5): 20.000MB/s transfers (20.000MHz, offset 15)
  Vendor: NEC  Model: CD-ROM DRIVE:465  Rev: 1.03
  Type: CD-ROM  ANSI SCSI revision: 02
blk: queue f7aac818, I/O limit 4095Mb (mask 0x)
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 213274368 512-byte hdwr sectors (109196 MB)
Partition check:
 sda: sda1 sda2
Attached scsi CD-ROM sr0 at scsi3, channel 0, id 5, lun 0
sr0: scsi3-mmc drive: 14x/32x cd/rw xa/form2 cdda tray
Re: 2.6.13-rc6-mm2 (hangs on non-SMP x86-64 and oopses)
Andrew,

On Tue, Aug 23, 2005 at 02:51:51PM +0200, Rafael J. Wysocki wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/
> >
> > - Various updates. Nothing terribly noteworthy.
>
> It hangs solid during boot (after starting kjournald) on Asus L5D
> (non-SMP x86-64), which is caused by this patch:
>
> 8250-serial-console-locking-bug-spelling-fix.patch
>
> (from binary search). If this patch is reverted, it oopses like in the
> following trace.

I thought this one was already pulled?

	Ralf
[2.4.31] - USB device numbering in /proc/bus/usb
Hello,

I've just rebooted a machine, and the eagle ADSL modem I was using,
presented as /proc/bus/usb/002/005, is now presented as
/proc/bus/usb/002/003 (same bus, but device ID changed from 5 to 3).

Is this expected behavior when running a 2.4.31 kernel? I would have
expected more stability in the numbering across reboots, the same way
IDE disk numbers are stable.

Paul
[PATCH] asus_acpi M6000A model support
Hello, here is patch for Asus M6A laptop support. It works fine for me. -- Lukáš Hejtmánek --- asus_acpi.c.old 2005-04-21 02:03:13.0 +0200 +++ asus_acpi.c 2005-05-08 18:22:49.0 +0200 @@ -128,6 +128,7 @@ L8L, //L8400L M1A, //M1300A M2E, //M2400E, L4400L + M6A, //M6000A M6N, //M6800N M6R, //M6700R P30, //Samsung P30 @@ -304,7 +305,20 @@ .display_set = SDSP, .display_get = \\INFB }, - + { + .name = M6A, + /* M6A does not have MLED */ + .mt_wled = WLED, + .mt_lcd_switch = xxN_PREFIX _Q10, + .lcd_status= \\RGPL, + .brightness_set= SPLV, + .brightness_get= GPLV, + .display_set = SDSP, + /* FIXME: this is not correct display_get. +* It always returns 1 +* */ + .display_get = \\ADVG + }, { .name = M6N, .mt_mled = MLED, @@ -622,7 +636,7 @@ { int lcd = 0; - if (hotk-model != L3H) { + if (hotk-model != L3H hotk-model != M6A) { /* We don't have to check anything if we are here */ if (!read_acpi_int(NULL, hotk-methods-lcd_status, lcd)) printk(KERN_WARNING Asus ACPI: Error reading LCD status\n); @@ -638,22 +652,33 @@ input.count = 2; input.pointer = mt_params; - /* Note: the following values are partly guessed up, but - otherwise they seem to work */ mt_params[0].type = ACPI_TYPE_INTEGER; - mt_params[0].integer.value = 0x02; mt_params[1].type = ACPI_TYPE_INTEGER; - mt_params[1].integer.value = 0x02; + if(hotk-model == L3H) { + /* Note: the following values are partly guessed up, +* but otherwise they seem to work */ + mt_params[0].integer.value = 0x02; + mt_params[1].integer.value = 0x02; + } else if(hotk-model == M6A) { + mt_params[0].integer.value = 0x15; + mt_params[1].integer.value = 0x01; + } output.length = sizeof(out_obj); output.pointer = out_obj; - status = acpi_evaluate_object(NULL, hotk-methods-lcd_status, input, output); + status = acpi_evaluate_object(NULL, hotk-methods-lcd_status, + input, output); if (status != AE_OK) return -1; - if (out_obj.type == ACPI_TYPE_INTEGER) - /* That's what the AML code does */ - lcd = out_obj.integer.value 8; + if (out_obj.type == 
ACPI_TYPE_INTEGER) { + if(hotk-model== L3H) { + /* That's what the AML code does */ + lcd = out_obj.integer.value 8; + } else if(hotk-model == M6A) { + lcd = out_obj.integer.value; + } + } } return (lcd 1); @@ -1029,6 +1054,8 @@ hotk-model = M6N; else if (strncmp(model-string.pointer, M6R, 3) == 0) hotk-model = M6R; + else if (strncmp(model-string.pointer, M6A, 3) == 0) + hotk-model = M6A; else if (strncmp(model-string.pointer, M2N, 3) == 0 || strncmp(model-string.pointer, M3N, 3) == 0 || strncmp(model-string.pointer, M5N, 3) == 0 || @@ -1058,8 +1085,9 @@ hotk-model = L5x; if (hotk-model == END_MODEL) { - printk(unsupported, trying default values, supply the - developers with your DSDT\n); + printk(unsupported model %s, trying default values, supply + the developers with your DSDT\n, + model-string.pointer); hotk-model = M2E; } else { printk(supported\n);
Re: Linux AIO status todo
On Tue 23/08/2005 at 11:56, Jakub Jelinek wrote:
> On Tue, Aug 23, 2005 at 01:14:38PM +0530, Suparna Bhattacharya wrote:
> > 2. No support for propagating IO completion events to user space
> >    threads using RT signals. User threads need to poll the completion
> >    queue using io_getevents. POSIX specifies that when an AIO request
> >    completes, a signal can be delivered to the application to indicate
> >    the completion of the IO.
>
> POSIX AIO needs to handle SIGEV_NONE, SIGEV_SIGNAL and SIGEV_THREAD
> notification. Obviously kernel shouldn't create threads for SIGEV_THREAD
> itself, as kernel shouldn't hardcode all the implementation details how
> a thread can be created. But it would be good if AIO signalling e.g.
> handled both SIGEV_SIGNAL and SIGEV_SIGNAL | SIGEV_THREAD_ID, with the
> same usage as e.g. timer_* syscalls. If kernel makes sure SI_ASYNCIO
> si_code is set in the notification signal siginfos, glibc could even use
> just one helper thread for timer_*/[al]io_* and maybe in the future
> other SIGEV_THREAD notification.

See chapter 2.2. AIO completion event.

The libposix-aio written by Sébastien and me manages all these cases:

http://www.bullopensource.org/posix/

There is a patch allowing the kernel to send a signal to a given process
on aio event completion:

http://cvs.sourceforge.net/viewcvs.py/paiol/kernel-patches/2.6.12/aioevent.patch?rev=1.1.1.1&view=auto

With the help of a helper thread in user space, libposix-aio is able to
manage SIGEV_THREAD and create a new thread by using user space code (and
thus implementation dependent calls):

http://cvs.sourceforge.net/viewcvs.py/paiol/libposix-aio/src/aio_read.c?view=markup
http://cvs.sourceforge.net/viewcvs.py/paiol/libposix-aio/src/aio_thread_create.c?view=markup

Sébastien wrote this part of libposix-aio (so I'm not an expert on this
part :-P ), but I think his helper thread is made like the glibc timer
helper thread is made.

And thus, if we want to merge libposix-aio in glibc, we should use the
existing mechanism, and it should be easy to put the POSIX AIO helper
thread portions inside the timer helper thread.

But only the glibc maintainer can answer this question: should we mix
timer and AIO code?

Laurent
--
-- Laurent Vivier ---
mailto:[EMAIL PROTECTED]
BULL/FREC:B1-226 phone: (+33) 476 29 7213 Bullcom: 229-7213
--[ DT/OSwRD/AIX ]-- http://www.bullopensource.org/ext4
dnotify/inotify and vfs questions
Hi,

I'm currently implementing change notification support for the linux cifs
client as part of Google's Summer of Code program.

In cifs, change notification works pretty much the same as dnotify does in
the kernel, and you cancel the notification by sending a NT_CANCEL request.
According to the fcntl manual you can cancel a notification by doing
fcntl(fd, F_NOTIFY, 0) (i.e. sending 0 as the notification mask), but
looking in the kernel code, fcntl_dirnotify() immediately calls
dnotify_flush() without ever telling the vfs module about it. Is there a
reason for this? Otherwise I'd propose calling
filp->f_op->dir_notify(filp, 0) at some point in this scenario.

Regarding inotify, inotify_add_watch doesn't seem to pass on the request
either, which works fine for local filesystem operations as they call
fsnotify_* functions every time, but that isn't really feasible for
filesystems like cifs because we'd have to request change notification on
everything. Are there plans for implementing a mechanism to let vfs modules
get watch requests too?

cheers,
Asser
[PATCH] FUTEX_WAKE_OP (pthread_cond_signal speedup)
Hi!

ATM pthread_cond_signal is unnecessarily slow, because it wakes one waiter
(which at least on UP usually means an immediate context switch to one of
the waiter threads). This waiter wakes up and after a few instructions it
attempts to acquire the cv internal lock, but that lock is still held by
the thread calling pthread_cond_signal. So it goes to sleep and eventually
the signalling thread is scheduled in, unlocks the internal lock and wakes
the waiter again.

Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal
to avoid this performance issue, but it was removed when locks were
redesigned to the 3 state scheme (unlocked, locked uncontended, locked
contended).

Following scenario shows why simply using FUTEX_REQUEUE in
pthread_cond_signal together with using lll_mutex_unlock_force in place of
lll_mutex_unlock is not enough and probably why it has been disabled at
that time:

The number is value in cv->__data.__lock.

        thr1            thr2            thr3
0       pthread_cond_wait
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
0       lll_futex_wait (&cv->__data.__futex, futexval)
0                       pthread_cond_signal
1                       lll_mutex_lock (cv->__data.__lock)
1                                       pthread_cond_signal
2                                       lll_mutex_lock (cv->__data.__lock)
2                                         lll_futex_wait (&cv->__data.__lock, 2)
2                       lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock)
                          # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE
2                       lll_mutex_unlock_force (cv->__data.__lock)
0                         cv->__data.__lock = 0
0                         lll_futex_wake (&cv->__data.__lock, 1)
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
          # Here, lll_mutex_unlock doesn't know there are threads waiting
          # on the internal cv's lock

Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal,
but it will cost us not one, but 2 extra syscalls and, what's worse, one of
these extra syscalls will be done for every single waiting loop in
pthread_cond_*wait.

We would need to use lll_mutex_unlock_force in pthread_cond_signal after
requeue and lll_mutex_cond_lock in pthread_cond_*wait after lll_futex_wait.

Another alternative is to do the unlocking pthread_cond_signal needs to do
(the lock can't be unlocked before lll_futex_wake, as that is racy) in the
kernel.

I have implemented both variants, futex-requeue-glibc.patch is the first
one and futex-wake_op{,-glibc}.patch is the unlocking inside of the kernel.
The kernel interface allows userland to specify how exactly an unlocking
operation should look like (some atomic arithmetic operation with optional
constant argument and comparison of the previous futex value with another
constant).

It has been implemented just for ppc*, x86_64 and i?86, for other
architectures I'm including just a stub header which can be used as a
starting point by maintainers to write support for their arches and ATM
will just return -ENOSYS for FUTEX_WAKE_OP. The requeue patch has been
(lightly) tested just on x86_64, the wake_op patch on ppc64 kernel running
32-bit and 64-bit NPTL and x86_64 kernel running 32-bit and 64-bit NPTL.

With the following benchmark on UP x86-64 I get:

for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
time elf/ld.so --library-path .:nptl-orig /tmp/bench
real 0m0.655s user 0m0.253s sys 0m0.403s
real 0m0.657s user 0m0.269s sys 0m0.388s
time elf/ld.so --library-path .:nptl-requeue /tmp/bench
real 0m0.496s user 0m0.225s sys 0m0.271s
real 0m0.531s user 0m0.242s sys 0m0.288s
time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
real 0m0.380s user 0m0.176s sys 0m0.204s
real 0m0.382s user 0m0.175s sys 0m0.207s

The benchmark is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt1.txt
Older futex-requeue-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt2.txt
Older futex-wake_op-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt3.txt
Will post a new version (just x86-64 fixes so that the patch applies
against pthread_cond_signal.S) to libc-hacker ml soon.

Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded
testcase that will not test the atomicity of the operation, but at least
check if the threads that should have been woken up are woken up and
whether the arithmetic operation in the kernel gave the expected results.

	Jakub

--- linux-2.6.12/include/linux/futex.h.jj	2005-06-17 21:48:29.0 +0200
+++ linux-2.6.12/include/linux/futex.h	2005-08-23 11:11:41.0 +0200
@@ -4,14 +4,40 @@
 /* Second argument to futex syscall */
 
-#define FUTEX_WAIT (0)
-#define
irq 11: nobody cared
Hail,

I posted a report a while back, no answer.

Who should I be talking to wrt the "irq 11: nobody cared" issue? I'm
happy to provide as much info as possible but need to know what info is
required.

I'm happily running 2.6.7, tried the latest and greatest (2.6.12) and
found the problem, then started by looking at 2.6.8 and found the problem
there too.

It happens on boot, is a showstopper and I'm wondering what, if anything
useful, I can provide you guys.

Throw me a bone...

Nige
Re: debug a high load average
On Tue, Aug 23, 2005 at 04:38:36PM +0530, Rajesh wrote:
> I have a case occasionally when I copy data from a usb storage (ipod) to
> my hard drive the load average goes up from 0.4 to about 15.0, and the
> system becomes very unusable till I kill the cp command. I have checked
> the CPU usage, bytes read from usb device, byte written to hard drive
> etc, and all these values are low like CPU usage is at a maximum of 30%,
> disk read bytes is at an average of 1.5 MiB/s, disk write bytes is at
> 1.5 MiB/s, number of processes is at 110, etc, during this high load.

1.5 MB/s suggests you're using an IDE drive in PIO mode. Switch to DMA
mode (hdparm -d 1 /dev/hda) and see if it gets any better.

Erik

--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
Re: IRQ problem with PCMCIA
On Tue, Aug 23, 2005 at 11:31:58AM +0100, Alan Cox wrote:
> On Maw, 2005-08-23 at 09:49 +0200, Erik Mouw wrote:
> > Is there any place where we can get your current patches?
>
> Which ones - the PATA IDE ones are in 2.6.11-ac, a subset in Fedora
> (other changes in the core IDE code make forward porting stuff for
> hotplug really tricky past 2.6.11).

I know about those and have been using them on my laptop.

> The SATA ones I can certainly put up if there is interest. I don't want
> to put them somewhere too available yet because this right now is stuff
> you only want to use under controlled circumstances for development
> until both they and the core SATA layer have some improvements.

That's the one I'm interested in. Yes, I do understand it can erase all
my partitions, etc.

Erik

--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
Re: Ext3 Errors on Dell RAID
On Tue, Aug 23, 2005 at 09:05:27AM -0400, Jess Balint wrote:
> Problem: I get massive ext3 errors once every few days. See errors on
> console section below. Almost all commands return I/O error. I have to
> power cycle the machine to get it running again. Upon reboot, there are
> usually 3 orphan inodes deleted and everything is fine. See messages on
> reboot below.
>
> Configuration:
> System: Dell PowerEdge 6300/500, 4 CPU SMP w/2GB memory
> Discs: 3 SCSI discs in a controller-managed striped configuration
> Controller: Dell PERC-2
> kernel messages in kernel boot messages below

This looks very familiar, and given the firmware versions you mention, is
probably a known issue. The controller firmware goes to do a cache flush,
but that doesn't complete in a sane amount of time, and eventually the
SCSI midlayer starts aborting commands and taking the file system offline.

I don't believe a firmware update was released for your add-in PERC2
quad-channel card. Firmware 6091 was released for the PERC3/Di ROMBs which
addresses this exact case, though other failures have been reported on
[EMAIL PROTECTED] (subscribe and read archives at
http://lists.us.dell.com) even with newer firmware.

The workarounds include:
1) disable the read and write cache using afacli.
2) mount file systems using 'noatime'.
3) backup your data, replace the controller with something newer (disks
   on the onboard aic7xxx controller combined with Linux Software RAID
   works quite well), recreate your RAID array on the new controller, and
   restore your data from backups.

Thanks,
Matt

--
Matt Domsch
Software Architect
Dell Linux Solutions linux.dell.com www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com
Re: what does scsi sense means?
On Tue, Aug 23, 2005 at 05:07:12PM +0800, jeff shia wrote:
> in the file of aic7.c ,what is the function of the structure of
> scsi_sense?here what is the meaning of sense?just like probe?

Return value of a failed command. Normally commands just succeed, but if
it fails, you can get sense information which tells you more about why a
particular command failed.

Erik

--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
[PATCH] fix whitespace handling on sysfs attributes
The first version of this patch didn't allow for the request firmware case
which does multiple parsing passes on the parameter. This was discussed in
the thread '2.6.13-rc6-mm1'

gregkh-driver-sysfs-strip_leading_trailing_whitespace-3.patch
should replace in 2.6.13-rc6-mm1
gregkh-driver-sysfs-strip_leading_trailing_whitespace.patch

Signed-off-by: Jon Smirl [EMAIL PROTECTED]

--
Jon Smirl
[EMAIL PROTECTED]

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -6,6 +6,7 @@
 #include <linux/fsnotify.h>
 #include <linux/kobject.h>
 #include <linux/namei.h>
+#include <linux/ctype.h>
 #include <asm/uaccess.h>
 #include <asm/semaphore.h>
 
@@ -207,8 +208,41 @@ flush_write_buffer(struct dentry * dentr
 	struct attribute * attr = to_attr(dentry);
 	struct kobject * kobj = to_kobj(dentry->d_parent);
 	struct sysfs_ops * ops = buffer->ops;
+	size_t ws_count = count, leading = 0;
+	int ret = 0;
+	char *x;
 
-	return ops->store(kobj,attr,buffer->page,count);
+	/* locate trailing white space */
+	while ((ws_count > 0) && isspace(buffer->page[ws_count - 1]))
+		ws_count--;
+	if (ws_count == 0)
+		return count;
+
+	/* locate leading white space */
+	x = buffer->page;
+	while (isspace(*x))
+		x++;
+	leading = x - buffer->page;
+	ws_count -= leading;
+
+	/* interface is still ambigous about this */
+	/* string is both passed by length and terminated */
+	if (ws_count != PAGE_SIZE)
+		x[ws_count] = '\0';
+
+	ret = ops->store(kobj, attr, x, ws_count);
+
+	/* is it an error? */
+	if (ret < 0)
+		return ret;
+
+	/* the whole string was consumed */
+	if (ret == ws_count)
+		return count;
+
+	/* only part of the string was consumed */
+	/* return count can not include trailing space */
+	return leading + ret;
 }
Re: [PATCH] mm: return ENOBUFS instead of ENOMEM in generic_file_buffered_write
On 8/23/05, Christoph Hellwig [EMAIL PROTECTED] wrote:
> On Tue, Aug 23, 2005 at 11:46:33AM +0300, Pekka J Enberg wrote:
> > As noticed by Dmitry Torokhov, write() can not return ENOMEM:
> >
> > http://www.opengroup.org/onlinepubs/95399/functions/write.html
> >
> > Therefore fixup generic_file_buffered_write() in mm/filemap.c (pointed
> > out by Nathan Scott).
>
> We had this discussion before, for EACCESS then. We've always been
> returning more errnos than SuS mentioned and Linus declared it's fine.

So does that mean that any error code is allowed? I would love to be able
to return ENODEV from a sysfs attribute if its device happens to be
removed in process.

Is there a list of valid errnos for Linux that supersedes SuS?

--
Dmitry
RE: CONFIG_PRINTK_TIME woes
> I'd hate to have to test for something for CONFIG_PRINTK_TIME every time
> sched_clock() is being called.

Me too.

> The quick fix would seem to be to only allow CONFIG_PRINTK_TIME from
> kernel cmdline to make it happen a bit later. So basically make int
> printk_time = 0 until command line is evaluated.

Good thought, but this won't work for ia64 in the hot-plug cpu case. There
are a couple of printk() calls by new cpus as they boot before they have
set-up their per-cpu areas. So there is no global state that can be
checked to decide whether it is safe for printk() to call sched_clock().

-Tony
Re: [PATCH] Posix file attribute support on VFAT (take #2)
On Mon, Aug 22, 2005 at 01:46:29PM +0200, Pavel Machek wrote:
> Unfortunately, it makes sense. If you have compact flash card, you
> really want to have VFAT there, so that it is a) compatible with windows
> and b) so that you don't kill the hardware.

VFAT is plenty good at killing hardware. It's a terrible filesystem for
flash cards (if they don't do their own wear leveling properly). Most of
the linux filesystems may not be any better but they are also no worse.

Windows compatibility is completely irrelevant if the card is being used
as your root filesystem since any extensions you make to vfat wouldn't be
understood by windows anyhow, so at best it makes a mess of it.

> I guess being able to use CF card for root filesystem is usefull, too

I run ext3 on CF and so far, no problems. I run with noatime and try to
avoid writing in general as much as possible. VFAT would be crap since,
well, I run linux on the system.

Len Sorensen
another Followup on 2.6.13-rc3 ACPI processor C-state regression
(It looks like my first try to send this message as a reply to the
"Followup ..." didn't work. If it worked: sorry for the double post.)

I use 2.6.13-rc6-mm1, which includes the patch as far as I can see, but
the C2 idle state (which my processor definitely supports) isn't detected.
It also isn't detected with 2.6.13-rc6 or 2.6.12.5, but it definitely
worked with some older 2.6.x kernel.

Is there any way to enforce using C2, so that you could say that the acpi
system uses C2 even if it is unable to detect that it is supported?

daniel

(Please CC me, because I am not on the list at the moment.)

--
# Daniel Nofftz ..
# This message was sent using IMP, the Internet Messaging Program.
Re: PROBLEM: Incorrect RAM Detected at kernel init
On Sun, Aug 21, 2005 at 11:27:51PM -0400, Terry wrote:
> Not sure if I have provided enough info, or to much info, but here it
> goes:
>
> [1.] One line summary of the problem:
> Not Detecting all the memory installed in the system.
>
> [2.] Full description of the problem/report:
> I have Linux Kernel 2.4.31 running on a Compaq 5000R server with 2 PPro
> 200 processors, 768M RAM, RealTeck 8139 Network Card, and Compaq Smart 2
> Raid controller with 5 9.1G drives in Raid 5 configuration. The kernel
> appears to compile perfectly, installs fine, but after reboot it is only
> reporting 16M of RAM. I have tried with and without the mem=768M boot up
> option in the lilo.conf script. All other modules and boot up includes
> appear to run perfectly fine. I had a 2.4.18 kernel running on this box
> just fine, detected all 768M of RAM and ran perfectly. The 2.4.31 Kernel
> runs almost perfectly, the only hold back is the false detection of
> memory.

Compaq machines of that era are known to have non standard bios methods
for identifying ram. Do a google search for how to pass memory maps to 2.6
kernels on a compaq.

i.e. something like: mem=exactmap [EMAIL PROTECTED] [EMAIL PROTECTED]

Add that to the kernel command line when booting and see what happens.

Len Sorensen
Re: [PATCH] FUTEX_WAKE_OP (pthread_cond_signal speedup)
On Tue, 23 Aug 2005, Jakub Jelinek wrote:
> Hi!
>
> ATM pthread_cond_signal is unnecessarily slow, because it wakes one
> waiter (which at least on UP usually means an immediate context switch
> to one of the waiter threads). This waiter wakes up and after a few
> instructions it attempts to acquire the cv internal lock, but that lock
> is still held by the thread calling pthread_cond_signal. So it goes to
> sleep and eventually the signalling thread is scheduled in, unlocks the
> internal lock and wakes the waiter again.
>
> With the following benchmark on UP x86-64 I get:
>
> for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
> for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
> time elf/ld.so --library-path .:nptl-orig /tmp/bench
> real 0m0.655s user 0m0.253s sys 0m0.403s
> real 0m0.657s user 0m0.269s sys 0m0.388s
> time elf/ld.so --library-path .:nptl-requeue /tmp/bench
> real 0m0.496s user 0m0.225s sys 0m0.271s
> real 0m0.531s user 0m0.242s sys 0m0.288s
> time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
> real 0m0.380s user 0m0.176s sys 0m0.204s
> real 0m0.382s user 0m0.175s sys 0m0.207s

translation: effective thread switching is now almost twice as fast with
the WAKE_OP extension of the futex interface. Cool!

a detail: many of the futex_atomic_op_inuser() seem to be duplicated
across architectures. Might be worth putting into asm-generic, to avoid
the duplication?

	Ingo
Re: [PATCH] FUTEX_WAKE_OP (pthread_cond_signal speedup)
On Tue, Aug 23, 2005 at 10:36:08AM -0400, Ingo Molnar wrote:
> a detail: many of the futex_atomic_op_inuser() seem to be duplicated
> across architectures. Might be worth putting into asm-generic, to avoid
> the duplication?

Those are stub files waiting for arch maintainers to actually implement
them, so they will be eventually different, but for the time being they
just -ENOSYS, so that things compile.

	Jakub
Re: 3com 3c59x stopped working with 2.6.13-rc[56]
Hello

> I assume it worked OK in 2.6.12.

Yes, sorry, forgot to mention that.

> > 18:27:47: eth1: Setting full-duplex based on MII #24 link partner capability of 05e1.
> > 18:32:02: NETDEV WATCHDOG: eth1: transmit timed out
> > 18:32:02: eth1: transmit timed out, tx_status 00 status e601.
> > 18:32:02: diagnostics: net 0cfa media 8880 dma 003a fifo 8800
> > 18:32:02: eth1: Interrupt posted but not delivered -- IRQ blocked by another device?
>
> gargh, I have acpi feelings. Could you please

It seems you had a good hunch.

> a) Compare /proc/interrupts for 2.6.12 and 2.6.13-rc6

/proc/interrupts for 2.6.12:

           CPU0
  0:   76133896          XT-PIC  timer
  1:       1170          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  9:          0          XT-PIC  acpi
 11:    2483056          XT-PIC  eth1
 14:     603767          XT-PIC  ide0
 15:         13          XT-PIC  ide1
NMI:          0
ERR:          0

/proc/interrupts for 2.6.13:

           CPU0
  0:     851172          XT-PIC  timer
  1:        802          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  eth1
 14:      30180          XT-PIC  ide0
 15:         13          XT-PIC  ide1
NMI:          0
ERR:          0

What is missing is acpi on irq9

> b) Generate the boot-time dmesg output for 2.6.12 and 2.6.13-rc6
>    (dmesg -s 100 > foo), then do
>    diff -u dmesg-2.6.12 dmesg-2.6.13-rc6 > foo and send foo?

Here is foo:

--- dmesg-2.6.12	2005-08-23 12:53:43.0 +0200
+++ dmesg-2.6.13	2005-08-23 14:26:54.0 +0200
@@ -1,4 +1,4 @@
-Linux version 2.6.12 ([EMAIL PROTECTED]) (gcc version 4.0.1 (Debian 4.0.1-2)) #1 Mon Aug 22 14:49:40 CEST 2005
+Linux version 2.6.13-rc6-git13 ([EMAIL PROTECTED]) (gcc version 4.0.1 (Debian 4.0.1-2)) #2 Mon Aug 22 15:22:10 CEST 2005
 BIOS-provided physical RAM map:
  BIOS-e820: - 000a (usable)
  BIOS-e820: 000f - 0010 (reserved)
@@ -13,25 +13,20 @@
   Normal zone: 126956 pages, LIFO batch:31
   HighMem zone: 0 pages, LIFO batch:1
 DMI 2.3 present.
-ACPI: RSDP (v000 ASUS ) @ 0x000f6c20
-ACPI: RSDT (v001 ASUS A7V266-C 0x30303031 MSFT 0x31313031) @ 0x1ffec000
-ACPI: FADT (v001 ASUS A7V266-C 0x30303031 MSFT 0x31313031) @ 0x1ffec080
-ACPI: BOOT (v001 ASUS A7V266-C 0x30303031 MSFT 0x31313031) @ 0x1ffec040
-ACPI: DSDT (v001 ASUS A7V266-C 0x1000 MSFT 0x010b) @ 0x
 Allocating PCI resources starting at 2000 (gap: 2000:dfff)
 Built 1 zonelists
-Kernel command line: BOOT_IMAGE=Linux.old ro root=301 lapic pci=usepirqmask
+Kernel command line: auto BOOT_IMAGE=Linux ro root=301 lapic pci=usepirqmask
 Initializing CPU#0
-CPU 0 irqstacks, hard=c0442000 soft=c0441000
+CPU 0 irqstacks, hard=c041b000 soft=c041a000
 PID hash table entries: 2048 (order: 11, 32768 bytes)
 Detected 1210.984 MHz processor.
 Using tsc for high-res timesource
 Console: colour VGA+ 80x25
 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
 Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
-Memory: 515260k/524208k available (2114k kernel code, 8412k reserved, 801k data, 392k init, 0k highmem)
+Memory: 515424k/524208k available (2010k kernel code, 8252k reserved, 753k data, 388k init, 0k highmem)
 Checking if this processor honours the WP bit even in supervisor mode... Ok.
-Calibrating delay loop... 2383.87 BogoMIPS (lpj=1191936)
+Calibrating delay using timer specific routine.. 2424.59 BogoMIPS (lpj=1212295)
 Mount-cache hash table entries: 512
 CPU: After generic identify, caps: 0383f9ff c1cbf9ff
 CPU: After vendor identify, caps: 0383f9ff c1cbf9ff
@@ -40,65 +35,49 @@
 CPU: After all inits, caps: 0383f9ff c1cbf9ff 0020
 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#0.
+mtrr: v2.0 (20020519)
 CPU: AMD Duron(TM)Processor stepping 01
 Enabling fast FPU save and restore... done.
 Enabling unmasked SIMD FPU exception support... done.
 Checking 'hlt' instruction... OK.
-ACPI: setting ELCR to 0200 (from 0400)
 NET: Registered protocol family 16
 PCI: PCI BIOS revision 2.10 entry at 0xf0f00, last bus=1
 PCI: Using configuration type 1
-mtrr: v2.0 (20020519)
-ACPI: Subsystem revision 20050309
-ACPI: Interpreter enabled
-ACPI: Using PIC for interrupt routing
-ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
-ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
-ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
-ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
-ACPI: PCI Root Bridge [PCI0] (:00)
+PCI: Probing PCI hardware
 PCI: Probing PCI hardware (bus 00)
 Boot video device is :01:00.0
-ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
-ACPI: PCI Interrupt
Re: irq 11: nobody cared
Nigel Rantor wrote:
> Who should I be talking to wrt to the irq 11: nobody cared issue? I'm
> happy to provide as much info as possible but need to know what info is
> required. I'm happily running 2.6.7, tried the latest and greatest
> (2.6.12) and found the problem, then started by looking at 2.6.8 and
> found the problem there too.

Try 2.6.13-rc6 and if it still appears, try the new irqpoll boot option.

Daniel
[PATCH 1/2] pci: Block config access during BIST (resend)
Brian King wrote:

Greg KH wrote:

Here is an updated patch which will now fail writes to config space while
the device is blocked. I have also fixed up the caching to return the
correct data and tested it on both little endian and big endian machines.

Applied, thanks.

greg k-h

Greg,

This patch appears to have been dropped. Please apply.

Thanks

--
Brian King
eServer Storage I/O
IBM Linux Technology Center

Some PCI adapters (eg. ipr scsi adapters) have an exposure today in that
they issue BIST to the adapter to reset the card. If, during the time it
takes to complete BIST, userspace attempts to access PCI config space, the
host bus bridge will master abort the access since the ipr adapter does
not respond on the PCI bus for a brief period of time when running BIST.
On PPC64 hardware, this master abort results in the host PCI bridge
isolating that PCI device from the rest of the system, making the device
unusable until Linux is rebooted. This patch is an attempt to close that
exposure by introducing some blocking code in the PCI code. When blocked,
writes will be humored and reads will return the cached value. Ben
Herrenschmidt has also mentioned that he plans to use this in PPC power
management.

Signed-off-by: Brian King [EMAIL PROTECTED]
---

 linux-2.6-bjking1/drivers/pci/access.c    |   86 ++
 linux-2.6-bjking1/drivers/pci/pci-sysfs.c |   20 +++---
 linux-2.6-bjking1/drivers/pci/pci.h       |    7 ++
 linux-2.6-bjking1/drivers/pci/proc.c      |   28 -
 linux-2.6-bjking1/drivers/pci/syscall.c   |   14 ++--
 linux-2.6-bjking1/include/linux/pci.h     |    5 +
 6 files changed, 129 insertions(+), 31 deletions(-)

diff -puN drivers/pci/access.c~pci_block_user_config_io_during_bist_again drivers/pci/access.c
--- linux-2.6/drivers/pci/access.c~pci_block_user_config_io_during_bist_again	2005-08-22 17:00:21.0 -0500
+++ linux-2.6-bjking1/drivers/pci/access.c	2005-08-22 17:00:21.0 -0500
@@ -60,3 +60,89 @@ EXPORT_SYMBOL(pci_bus_read_config_dword)
 EXPORT_SYMBOL(pci_bus_write_config_byte);
 EXPORT_SYMBOL(pci_bus_write_config_word);
 EXPORT_SYMBOL(pci_bus_write_config_dword);
+
+static u32 pci_user_cached_config(struct pci_dev *dev, int pos)
+{
+	u32 data;
+
+	data = dev->saved_config_space[pos/sizeof(dev->saved_config_space[0])];
+	data >>= (pos % sizeof(dev->saved_config_space[0])) * 8;
+	return data;
+}
+
+#define PCI_USER_READ_CONFIG(size,type) \
+int pci_user_read_config_##size \
+	(struct pci_dev *dev, int pos, type *val) \
+{ \
+	unsigned long flags; \
+	int ret = 0; \
+	u32 data = -1; \
+	if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER; \
+	spin_lock_irqsave(&pci_lock, flags); \
+	if (likely(!dev->block_ucfg_access)) \
+		ret = dev->bus->ops->read(dev->bus, dev->devfn, \
+					pos, sizeof(type), &data); \
+	else if (pos < sizeof(dev->saved_config_space)) \
+		data = pci_user_cached_config(dev, pos); \
+	spin_unlock_irqrestore(&pci_lock, flags); \
+	*val = (type)data; \
+	return ret; \
+}
+
+#define PCI_USER_WRITE_CONFIG(size,type) \
+int pci_user_write_config_##size \
+	(struct pci_dev *dev, int pos, type val) \
+{ \
+	unsigned long flags; \
+	int ret = -EIO; \
+	if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER; \
+	spin_lock_irqsave(&pci_lock, flags); \
+	if (likely(!dev->block_ucfg_access)) \
+		ret = dev->bus->ops->write(dev->bus, dev->devfn, \
+					pos, sizeof(type), val); \
+	spin_unlock_irqrestore(&pci_lock, flags); \
+	return ret; \
+}
+
+PCI_USER_READ_CONFIG(byte, u8)
+PCI_USER_READ_CONFIG(word, u16)
+PCI_USER_READ_CONFIG(dword, u32)
+PCI_USER_WRITE_CONFIG(byte, u8)
+PCI_USER_WRITE_CONFIG(word, u16)
+PCI_USER_WRITE_CONFIG(dword, u32)
+
+/**
+ * pci_block_user_cfg_access - Block userspace PCI config reads/writes
+ * @dev: pci device struct
+ *
+ * This
Re: [Samba] Re: New maintainer needed for the Linux smb filesystem
Ian Kent wrote:
> On Sun, 21 Aug 2005, Gerald (Jerry) Carter wrote:
>> Steven French wrote:
>>> We are close, but not quite ready to disable smbfs.
>>
>> Steve, I have been itching to work on some kernel code. If you need someone just to keep things afloat, I'd be happy to look into it. There would be some start up time of course. If you would be willing to help me navigate the things other than code, it shouldn't be that big of a deal.
>
> I wouldn't mind helping out here either. Perhaps a joint effort Jerry?

That's fine by me. Steve, I'll touch base with you on #samba-technical to work out what to do first. I know we have had a lot of reports on https://bugzilla.samba.org/ that were originally closed as invalid since we weren't supporting the kernel smbfs code at that time.

cheers, jerry
[PATCH][RESEND] don't allow sys_readahead() on files opened with O_DIRECT
IMO sys_readahead() doesn't make sense if the file is opened with O_DIRECT, because the page cache is stuffed but never used. Therefore this patch changes that by letting the call return with -EINVAL.

Signed-off-by: Jan Blunck [EMAIL PROTECTED]

 mm/filemap.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

Index: experimental-jb/mm/filemap.c
===
--- experimental-jb.orig/mm/filemap.c
+++ experimental-jb/mm/filemap.c
@@ -,7 +,8 @@ static ssize_t
 do_readahead(struct address_space *mapping, struct file *filp,
 	     unsigned long index, unsigned long nr)
 {
-	if (!mapping || !mapping->a_ops || !mapping->a_ops->readpage)
+	if (!mapping || !mapping->a_ops || !mapping->a_ops->readpage
+	    || (filp->f_flags & O_DIRECT))
 		return -EINVAL;
 
 	force_page_cache_readahead(mapping, filp, index,
Re: [PATCH 2.6.13-rc6] cpu_exclusive sched domains on partial nodes temp fix
On Tue, Aug 23, 2005 at 01:04:27AM -0700, Paul Jackson wrote:
> If Dinakar, Hawkes and Nick concur (and no one else complains too loud) then the following should go into 2.6.13, to avoid the potential kernel oops that Hawkes reported in Dinakar's feature to allow user control of dynamic sched domain placement using cpu_exclusive cpusets.

I agree this is the way to go for 2.6.13 before we fix things the right way for 2.6.14. Thanks for the patch Paul.

> This patch should allow proceeding with this new feature in 2.6.13 for the configurations in which it is useful (node aligned sched domains) while avoiding trying to set up sched domains in the less useful cases that can cause the kernel corruption and oops.

Dunno if it is something in my setup (4 CPU Power5 box with NUMA enabled) but this patch causes some hard hangs when I run the attached script. The same script runs for much longer with Ingo's changes but panics as I had described earlier. I am still debugging what causes this.

-Dinakar

sd-stress.tar.gz
Description: GNU Zip compressed data
Re: [PATCH 2/2] ipr: Block config access during BIST (resend)
Greg,

Please apply along with the previous pci patch.

Thanks

--
Brian King
eServer Storage I/O
IBM Linux Technology Center

IPR SCSI adapters have an exposure today in that they issue BIST to the adapter to reset the card. If, during the time it takes to complete BIST, userspace attempts to access PCI config space, the host bus bridge will master abort the access since the ipr adapter does not respond on the PCI bus for a brief period of time when running BIST. On PPC64 hardware, this master abort results in the host PCI bridge isolating that PCI device from the rest of the system, making the device unusable until Linux is rebooted. This patch makes use of some newly added PCI layer APIs that allow for protection from userspace accessing config space of a device in scenarios such as this.

Signed-off-by: Brian King [EMAIL PROTECTED]
---

 linux-2.6-bjking1/drivers/scsi/ipr.c |2 ++
 1 files changed, 2 insertions(+)

diff -puN drivers/scsi/ipr.c~ipr_block_user_config_io_during_bist drivers/scsi/ipr.c
--- linux-2.6/drivers/scsi/ipr.c~ipr_block_user_config_io_during_bist	2005-08-22 17:03:57.0 -0500
+++ linux-2.6-bjking1/drivers/scsi/ipr.c	2005-08-22 17:03:57.0 -0500
@@ -4944,6 +4944,7 @@ static int ipr_reset_restore_cfg_space(s
 	int rc;
 
 	ENTER;
+	pci_unblock_user_cfg_access(ioa_cfg->pdev);
 	rc = pci_restore_state(ioa_cfg->pdev);
 
 	if (rc != PCIBIOS_SUCCESSFUL) {
@@ -4998,6 +4999,7 @@ static int ipr_reset_start_bist(struct i
 	int rc;
 
 	ENTER;
+	pci_block_user_cfg_access(ioa_cfg->pdev);
 	rc = pci_write_config_byte(ioa_cfg->pdev, PCI_BIST, PCI_BIST_START);
 
 	if (rc != PCIBIOS_SUCCESSFUL) {
_
Re: [2.4.31] - USB device numbering in /proc/bus/usb
On Tue, 23 Aug 2005 15:14:38 +0200 Paul Rolland wrote:
> I've just rebooted a machine, and the eagle ADSL modem I was using, presented as /proc/bus/usb/002/005, is now presented as /proc/bus/usb/002/003 (same bus, but device ID changed from 5 to 3). Is this an expected behavior, when running a 2.4.31 kernel?

Yes. Addresses for USB devices are assigned dynamically. If you disconnect the modem from USB and connect it again, its address will change.

> I would have been expecting some more stability in the numbering across reboot, the same way IDE disk numbers are stable.

Use some other identifier which is stable, e.g., the serial number of the USB device (unfortunately, many devices don't have one).
Re: dnotify/inotify and vfs questions
Asser Femø wrote:
> According to the fcntl manual you can cancel a notification by doing fcntl(fd, F_NOTIFY, 0) (i.e. sending 0 as the notification mask), but looking in the kernel code, fcntl_dirnotify() immediately calls dnotify_flush(), and neither tells the vfs module about it. Is there a reason for this? Otherwise I'd propose calling filp->f_op->dir_notify(filp, 0) at some point in this scenario.
>
> Regarding inotify, inotify_add_watch doesn't seem to pass on the request either, which works fine for local filesystem operations as they call fsnotify_* functions every time, but that isn't really feasible for filesystems like cifs because we'd have to request change notification on everything. Are there plans for implementing a mechanism to let vfs modules get watch requests too?

On a related note: dnotify and inotify on local filesystems appear to be synchronous, in the following rather useful sense:

If you have previously registered for inotify/dnotify events that will catch a change to a file, and called stat() on the file, then the following operation:

    receive some request...
    stat_info = stat(file)

may be replaced in userspace code with:

    receive some request...
    if (any_dnotify_or_inotify_events_pending) {
        read_dnotify_or_inotify_events();
        if (any_events_related_to(file)) {
            store_in_userspace_stat_cache(file, stat(file));
        }
    }
    stat_info = lookup_userspace_stat_cache(file);

Now that's a silly way to save one system call in the fast path by itself. But when the stat_info is a prerequisite for validating cached data (such as the contents of a file parsed into a data structure) it can save a lot of system calls and logical work.

For example, an Apache-style path walk which checks for .htaccess, or a Samba-style path walk which is checking for unsafe symbolic links, can be reduced from say 20 system calls to zero using this method.

For pre-compiled or pre-parsed programs/scripts/templates/config-files, where all the source files used are prerequisites for invalidating a cached compiled form, this reduces, say, 40 system calls to stat() all the source files to zero. That's quite a saving.

It's not just reducing system calls. The logical tests in userspace are also skipped, if coded properly, facilitating very quick decisions about things that depend on files which mostly don't change. (Cascading structured cache prerequisites...mmm).

Remote dnotify/inotify doesn't _necessarily_ have this synchronous property. It may do in some cases, depending on the implementation (this is subtle...).

So, it would be nice if there was a way to query this, rather than the tedious method of testing the filesystem type and having a table of known local filesystem types where it's safe to depend on this property.

Alternatively, a way to specify at dnotify/inotify creation time that synchronous notifications are required, and have the request rejected if those can't be provided.

-- Jamie
kernel BUG at kernel/workqueue.c:104!
Hi, i get this a lot now when doing: rmmod cp2101 io_edgeport I try and do the rmmod, because i loose comunications on the USB to RS-232 adapters. Not sure if i did the ksymoops correctly but here it is: # ./ksymoops ksymoops 2.4.9 on i686 2.6.12-gentoo-r9. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.6.12-gentoo-r9/ (default) -m /usr/src/linux/System.map (default) Warning: You did not tell me where to find symbol information. I will assume that the log matches the kernel and modules that are running right now and I'll use the default options above for symbol resolution. If the current kernel and/or modules do not match the log, you can get more accurate output by telling me the kernel version and where to find map, modules, ksyms etc. ksymoops -h explains the options. Error (regular_file): read_ksyms stat /proc/ksyms failed ./ksymoops: No such file or directory No modules in ksyms, skipping objects No ksyms, skipping lsmod Reading Oops report from the terminal [ cut here ] kernel BUG at kernel/workqueue.c:104! 
invalid operand: [#1] PREEMPT Modules linked in: cp2101 io_edgeport ipv6 nfs lockd sunrpc ohci_hcd analog ns558 parport_pc parport pcspkr rtc nvidia via_rhine mii snd_via82xx gameport snd_ac97_codec snd_mpu401_uart snd_rawmidi i2c_viapro i2c_core ehci_hcd usbserial uhci_hcd lpclinux via_agp agpgart usbcore CPU:0 EIP:0060:[c0125213]Tainted: P VLI EFLAGS: 00210213 (2.6.12-gentoo-r9) EIP is at queue_work+0x73/0x80 eax: cc5e7948 ebx: cc5e7944 ecx: 0001 edx: d6de1000 esi: dffe7960 edi: ebp: d6de1000 esp: d6de1eac ds: 007b es: 007b ss: 0068 Process rmmod (pid: 11256, threadinfo=d6de1000 task=cc84e510) Stack: 0002 cdf78000 ccfa1f20 d2f29df4 e086c1fe cc5e7000 cdf78000 0083 d2f29de0 e0979020 e0979040 e0917134 d2f29de0 d2f29de0 d2f29df4 d2f29e18 d2f29df4 c0308667 d2f29df4 c041d290 e0979040 e0979088 Call Trace: [e086c1fe] usb_serial_disconnect+0x8e/0xc0 [usbserial] [e0917134] usb_unbind_interface+0x84/0x90 [usbcore] [c0308667] device_release_driver+0x77/0x80 [c03086a0] driver_detach+0x30/0x40 [c0308b3c] bus_remove_driver+0x4c/0x90 [c03090f3] driver_unregister+0x13/0x30 [e0917227] usb_deregister+0x37/0x50 [usbcore] [e097786f] cp2101_exit+0xf/0x1f [cp2101] [c012ef67] sys_delete_module+0x167/0x1a0 [c0153941] sys_write+0x51/0x80 [c0102dc1] syscall_call+0x7/0xb Code: d4 c1 fe ff b8 00 f0 ff ff 21 e0 8b 40 08 a8 08 75 12 89 f8 8b 5c 24 08 8b 74 24 0c 8b 7c 24 10 83 c4 14 c3 e8 6f 5f 2c 00 eb e7 0f 0b 68 00 2d 37 40 c0 eb b4 8d 76 00 83 ec 08 8b 44 24 0c 8b 6note: rmmod[11256] exited with preempt_count 1 scheduling while atomic: rmmod/0x1001/11256 [c03eb176] schedule+0x5f6/0x600 [c0142a9a] unmap_page_range+0x8a/0xb0 [c03eb9fc] cond_resched+0x2c/0x50 [c0142c68] unmap_vmas+0x1a8/0x200 [c01476b3] exit_mmap+0x83/0x170 [c0103ba0] do_invalid_op+0x0/0xd0 [c0112a87] mmput+0x37/0xb0 [c0117630] do_exit+0xb0/0x3d0 [c0103ba0] do_invalid_op+0x0/0xd0 [c01037db] die+0x18b/0x190 [c0103c4e] do_invalid_op+0xae/0xd0 [c030b3c6] pool_find_page+0x46/0x70 [c0125213] queue_work+0x73/0x80 [c030b46b] 
dma_pool_free+0x7b/0x112 [e091d580] urb_destroy+0x0/0x10 [usbcore] [e091dbed] usb_start_wait_urb+0xcd/0xf0 [usbcore] [e0966bf4] qh_destroy+0x54/0x80 [ehci_hcd] [e0966ba0] qh_destroy+0x0/0x80 [ehci_hcd] [c0299e1d] kref_put+0x3d/0xa0 [e096b444] ehci_endpoint_disable+0x124/0x172 [ehci_hcd] [e0966ba0] qh_destroy+0x0/0x80 [ehci_hcd] [c0102fdb] error_code+0x4f/0x54 [c0125213] queue_work+0x73/0x80 [e086c1fe] usb_serial_disconnect+0x8e/0xc0 [usbserial] [e0917134] usb_unbind_interface+0x84/0x90 [usbcore] [c0308667] device_release_driver+0x77/0x80 [c03086a0] driver_detach+0x30/0x40 [c0308b3c] bus_remove_driver+0x4c/0x90 [c03090f3] driver_unregister+0x13/0x30 [e0917227] usb_deregister+0x37/0x50 [usbcore] [e097786f] cp2101_exit+0xf/0x1f [cp2101] [c012ef67] sys_delete_module+0x167/0x1a0 [c0153941] sys_write+0x51/0x80 [c0102dc1] syscall_call+0x7/0xb kernel BUG at kernel/workqueue.c:104! invalid operand: [#1] CPU:0 EIP:0060:[c0125213]Tainted: P VLI Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00210213 (2.6.12-gentoo-r9) eax: cc5e7948 ebx: cc5e7944 ecx: 0001 edx: d6de1000 esi: dffe7960 edi: ebp: d6de1000 esp: d6de1eac ds: 007b es: 007b ss: 0068 Stack: 0002 cdf78000 ccfa1f20 d2f29df4 e086c1fe cc5e7000 cdf78000 0083 d2f29de0 e0979020 e0979040 e0917134 d2f29de0 d2f29de0 d2f29df4 d2f29e18 d2f29df4 c0308667 d2f29df4 c041d290 e0979040 e0979088 Call Trace: [e086c1fe] usb_serial_disconnect+0x8e/0xc0 [usbserial] [e0917134] usb_unbind_interface+0x84/0x90 [usbcore] [c0308667] device_release_driver+0x77/0x80 [c03086a0] driver_detach+0x30/0x40 [c0308b3c] bus_remove_driver+0x4c/0x90 [c03090f3] driver_unregister+0x13/0x30 [e0917227]
Re: [2.4.31] - USB device numbering in /proc/bus/usb
Hello Sergey,

> Yes. Addresses for USB devices are assigned dynamically. If you disconnect the modem from USB and connect it again, its address will change.

The problem I have is that nothing changed on the machine except that I did a reboot. Nothing (USB device) added, nothing removed, so with a stable hardware config, USB numbering should have stayed stable, IMHO.

> > I would have been expecting some more stability in the numbering across reboot, the same way IDE disk numbers are stable.
>
> Use some other identifier which is stable, e.g., the serial number of the USB device (unfortunately, many devices don't have one).

Well yes, I'm going to try to convert to some other identifier space, as this seems to be the only way to go.

Thanks for the confirmation,

Regards,
Paul
Re: 2.6.13-rc6-mm2
Yup, seems to be generally good... Noticed this in the log earlier tonight: Aug 23 19:44:51 tornado kernel: hub 5-0:1.0: port 1 disabled by hub (EMI?), re-enabling... Aug 23 19:44:51 tornado kernel: usb 5-1: USB disconnect, address 2 Aug 23 19:44:51 tornado kernel: drivers/usb/class/usblp.c: usblp0: removed Aug 23 19:44:51 tornado kernel: Unable to handle kernel NULL pointer dereference at virtual address 0004 Aug 23 19:44:51 tornado kernel: printing eip: Aug 23 19:44:51 tornado kernel: c01ccef2 Aug 23 19:44:51 tornado kernel: *pde = Aug 23 19:44:51 tornado kernel: Oops: [#1] Aug 23 19:44:51 tornado kernel: SMP Aug 23 19:44:51 tornado kernel: last sysfs file: /devices/pci:00/:00:1f.3/i2c-0/name Aug 23 19:44:51 tornado kernel: Modules linked in: nfsd exportfs lockd eeprom sunrpc ipv6 iptable_filter binfmt_misc reiser4 zlib_de flate zlib_inflate dm_mod video thermal processor fan button ac tpm_nsc i2c_i801 sky2 e100 sr_mod Aug 23 19:44:51 tornado kernel: CPU:1 Aug 23 19:44:51 tornado kernel: EIP:0060:[c01ccef2]Not tainted VLI Aug 23 19:44:51 tornado kernel: EFLAGS: 00010286 (2.6.13-rc6-mm2) Aug 23 19:44:51 tornado kernel: EIP is at _raw_spin_lock+0x7/0x73 Aug 23 19:44:51 tornado kernel: eax: ebx: ecx: c1a60658 edx: c1a63e24 Aug 23 19:44:51 tornado kernel: esi: edi: c0382400 ebp: f7c55e98 esp: f7c55e90 Aug 23 19:44:51 tornado kernel: ds: 007b es: 007b ss: 0068 Aug 23 19:44:51 tornado kernel: Process khubd (pid: 109, threadinfo=f7c54000 task=c192b030) Aug 23 19:44:51 tornado kernel: Stack: f7c58a8c f7c55ea0 c0312219 f7c55eb0 c030feb7 f7c58ae8 f7c58a48 Aug 23 19:44:51 tornado kernel:f7c55ec4 c0217e73 f7c58a48 f7d134ec 0040 f7c55ed0 c0217ec0 f7c58a48 Aug 23 19:44:51 tornado kernel:f7c55edc c0217814 f7c58a48 f7c55eec c0216ad2 f7c58a48 f7c58a14 f7c55ef8 Aug 23 19:44:51 tornado kernel: Call Trace: Aug 23 19:44:51 tornado kernel: [c01039c3] show_stack+0x94/0xca Aug 23 19:44:51 tornado kernel: [c0103b6c] show_registers+0x15a/0x1ea Aug 23 19:44:51 tornado kernel: [c0103d8a] 
die+0x108/0x183 Aug 23 19:44:51 tornado kernel: [c031295a] do_page_fault+0x1ea/0x63d Aug 23 19:44:51 tornado kernel: [c0103693] error_code+0x4f/0x54 Aug 23 19:44:51 tornado kernel: [c0312219] _spin_lock+0x8/0xa Aug 23 19:44:51 tornado kernel: [c030feb7] klist_remove+0x10/0x2c Aug 23 19:44:51 tornado kernel: [c0217e73] __device_release_driver+0x41/0x65 Aug 23 19:44:51 tornado kernel: [c0217ec0] device_release_driver+0x29/0x39 Aug 23 19:44:51 tornado kernel: [c0217814] bus_remove_device+0x52/0x60 Aug 23 19:44:51 tornado kernel: [c0216ad2] device_del+0x2e/0x5d Aug 23 19:44:51 tornado kernel: [c0216b0c] device_unregister+0xb/0x15 Aug 23 19:44:51 tornado kernel: [c0275d67] usb_disconnect+0x115/0x15c Aug 23 19:44:51 tornado kernel: [c0276b85] hub_port_connect_change+0x54/0x399 Aug 23 19:44:51 tornado kernel: [c027713e] hub_events+0x274/0x3b2 Aug 23 19:44:51 tornado kernel: [c0277296] hub_thread+0x1a/0xdf Aug 23 19:44:51 tornado kernel: [c012fba7] kthread+0x99/0x9d Aug 23 19:44:51 tornado kernel: [c01010b5] kernel_thread_helper+0x5/0xb Aug 23 19:44:51 tornado kernel: Code: 00 00 00 8b 0d a8 62 36 c0 e9 61 ff ff ff f3 90 31 c0 86 07 84 c0 0f 8e 79 ff ff ff 83 c4 18 5 b 5e 5f 5d c3 55 89 e5 56 53 89 c3 81 78 04 ad 4e ad de 75 2d be 00 e0 ff ff 21 e6 8b 06 39 43 0c this one is my fault, caused by driver-core-fix-bus_rescan_devices-race.patch problem is that USB is direclty messing with dev-driver and then calling device_bind_driver() if the device is not already bound... i think the correct solution would be a sane API here and disallow direct messing with dev-driver...meanwhile the attached patch will do. messing directly with dev-driver is especially bad if it's already set to another driver. this leads to problems later in device_release_driver(). akpm: please replace driver-core-fix-bus_rescan_devices-race.patch with the attached one. rgds -daniel --- [PATCH] driver core: fix bus_rescan_devices() race. 
bus_rescan_devices_helper() does not hold the dev-sem when it checks for !dev-driver. device_attach() holds the sem, but calls again device_bind_driver() even when dev-driver is set. what happens is that a first device_attach() call (module insertion time) is on the way binding the device to a driver. another thread calls bus_rescan_devices(). now when bus_rescan_devices_helper() checks for dev-driver it is still NULL 'cos the the prior device_attach() is not yet finished. but as soon as the first one releases the dev-sem the second device_attach() tries to rebind the already bound device again. device_bind_driver() does this blindly which leads to a corrupt driver-klist_devices list (the device links itself, the head points to the device). later a call to device_release_driver() sets dev-driver to NULL and breaks the link it has to
RE: kernel module seg fault
Hi,

This is the code where I am getting this problem.

static byte4 VNICClientStart(unsigned long arg)
{
	VNICClientCfgCreateInfo_t clientConfig;
	struct socket *sock = NULL;
	ubyte4 status = 0;
	ubyte4 retryCnt = VNIC_CLIENT_MAX_CONN_RETRY_CNT;
	ubyte4 ret = 0;
	byte4 len = 0;
	struct net_device *dev = NULL;
	VNICConnMap_t *connMap = NULL;
	byte4 error = 0;
	VNICHdrForm_t vnicHdr;
	VNICVirtMirrIfaceAndServIPList_t *ifaceIPNode = NULL;
	DECLARE_WAIT_QUEUE_HEAD(wq);

	init_waitqueue_head(&wq);
	EnterFunction(VNICClientStart);
	memset(&vnicHdr, 0, sizeof(vnicHdr));
	while (retryCnt) {
		--retryCnt;
		if (!retryCnt) {
			return VNIC_CLIENT_SERVER_RESPONSE_TIMEOUT;
		}
		/* wait for small */
		interruptible_sleep_on_timeout(&wq, 2);
	} /* end while (retryCnt)*/
	LeaveFunction(VNICClientStart);
	return VNIC_CLIENT_SERVER_SUCCESS; /* for success */
} /* end VNICClientStart() */

I commented out all the other functionality of this function to make it simple, but it still causes a kernel panic. This function gets called when I invoke ioctl() from my user application, and that is when the kernel panics.

Regards,
Manomugdha

--- [EMAIL PROTECTED] wrote:
> Hi Biswas,
> You need to post the complete kernel dump message and body of your source code.
> -Bunnan
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of manomugdha biswas
> Sent: Tuesday, August 23, 2005 3:13 PM
> To: linux-kernel@vger.kernel.org
> Subject: kernel module seg fault
>
> Hi,
> I have written a kernel module and I can load (insmod) it without any error. But when I run my module it gets a seg fault at interruptible_sleep_on_timeout(). I have used this function in the following way:
>
>     DECLARE_WAIT_QUEUE_HEAD(wq);
>     init_waitqueue_head(&wq);
>     interruptible_sleep_on_timeout(&wq, 2);
>
> I am using Red Hat version 9.0 and kernel version 2.4.20-8. Could you please shed some light on this issue?
>
> Manomugdha Biswas
Re: [PATCH] fix send_sigqueue() vs thread exit race
Thomas Gleixner wrote:
> On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote:
> > kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after that nobody can access this timer, so we don't need to lock timer->it_lock at all in this case. No lock - no deadlock.
>
> It still deadlocks:
>
> CPU 0                             CPU 1
> write_lock(&tasklist_lock);
> __exit_signal()
>                                   timer expires
>                                   base->running_timer = timer
>                                   send_group_sigqueue()
>                                   read_lock(&tasklist_lock);
> exit_itimers()
> del_timer_sync(timer)
> waits for ever because            waits for ever on tasklist_lock
> base->running_timer == timer

Silly me.

> I still think the last patch I sent is still necessary.

Thomas, you know that I like this change in __exit_{signal,sighand}, but I think this change is dangerous, should go in a separate patch, and needs a lot of testing. But the decision is up to Ingo and Roland.

I am looking at your previous patch:

> -	read_lock(&tasklist_lock);
> +retry:
> +	if (unlikely(p->flags & PF_EXITING))
> +		return -1;
> +
> +	if (unlikely(!read_trylock(&tasklist_lock))) {
> +		cpu_relax();
> +		goto retry;
> +	}
> +	if (unlikely(p->flags & PF_EXITING)) {
> +		ret = -1;
> +		goto out_err;

What do you think about this:

int try_to_lock_this_beep_tasklist_lock(struct task_struct *group_leader)
{
	while (unlikely(!read_trylock(&tasklist_lock))) {
		if (group_leader->flags & PF_EXITING) {
			smp_rmb();
			if (thread_group_empty(group_leader))
				return 0;
		}
		cpu_relax();
	}
	return 1;
}

No need to re-check after we got tasklist_lock, the signal will be flushed.

I think it's better to move the locking into posix_timer_event(), btw. In that case we can drop my patch.

What is your opinion, can it work?

P.S. Thomas, thanks for the explanation about posix-cpu-timers.

Oleg.
Re: irq 11: nobody cared
Nigel Rantor wrote:
> Hail,
>
> I posted a report a while back, no answer. Who should I be talking to wrt the "irq 11: nobody cared" issue? I'm happy to provide as much info as possible but need to know what info is required.
>
> I'm happily running 2.6.7, tried the latest and greatest (2.6.12) and found the problem, then started by looking at 2.6.8 and found the problem there too. It happens on boot, is a showstopper and I'm wondering what, if anything useful, I can provide you guys.
>
> Throw me a bone...

Read REPORTING-BUGS. We can't do much of anything with this report. Tell us what's on irq 11, for starters.

Jeff
Re: some missing spin_unlocks
From: Ted Unangst [EMAIL PROTECTED]
Date: Mon, 22 Aug 2005 15:26:47 -0700

> net/rose/rose_route.c rose_route_frame, line 998 returns without unlocking rose_node_list_lock, rose_neigh_list_lock, or rose_route_list_lock

I fixed this one with the patch below.

> net/rose/rose_timer.c rose_heartbeat_expiry, line 141 rose_destroy_socket does not unlock sk as far as i can see

This one needs more care. We can't drop the lock, because the destroy actions need to be protected by that lock, but we can't release the lock after rose_destroy_socket() because the object may not even exist any longer.

The problem there, at the core, is that the timer doesn't grab a reference to the socket, which would make the solution to this bug very straightforward. Someone should work on that :-)

diff-tree 61ef36aa6cf356649863a24a850c2183cb762c61 (from daf53344fadaa8c47c6b0864e7f34efcbb66e391)
Author: David S. Miller [EMAIL PROTECTED]
Date:   Tue Aug 23 09:42:38 2005 -0700

    [ROSE]: Fix missing unlocks in rose_route_frame()

    Noticed by Coverity checker.

    Signed-off-by: David S. Miller [EMAIL PROTECTED]

diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c
--- a/net/rose/rose_route.c
+++ b/net/rose/rose_route.c
@@ -994,8 +994,10 @@ int rose_route_frame(struct sk_buff *skb
 	 *	1. The frame isn't for us,
 	 *	2. It isn't owned by any existing route.
 	 */
-	if (frametype != ROSE_CALL_REQUEST)	/* XXX */
-		return 0;
+	if (frametype != ROSE_CALL_REQUEST) {	/* XXX */
+		ret = 0;
+		goto out;
+	}
 
 	len  = (((skb->data[3] >> 4) & 0x0F) + 1) / 2;
 	len += (((skb->data[3] >> 0) & 0x0F) + 1) / 2;
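
[Editorial sketch, not part of the original message.] The fix above replaces an early return taken while locks are held with a jump to a single exit label that releases them. A minimal userspace model of that pattern, using one pthread mutex in place of the three ROSE list locks, might look like this. The names route_lock, route_frame and the frame_type values are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical single lock standing in for the ROSE list spinlocks. */
static pthread_mutex_t route_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical frame types; only the CALL_REQUEST distinction matters. */
enum frame_type { FRAME_CALL_REQUEST, FRAME_OTHER };

/*
 * Returns 1 if the frame was accepted, 0 otherwise. Every path taken after
 * the lock is acquired leaves through `out`, so the lock is always released.
 */
static int route_frame(enum frame_type type, int have_route)
{
	int ret;
	int rc = pthread_mutex_lock(&route_lock);

	assert(rc == 0);
	(void)rc;  /* keeps builds with NDEBUG free of unused warnings */

	if (type != FRAME_CALL_REQUEST) {
		ret = 0;  /* a bare `return 0;` here would leak the lock */
		goto out;
	}
	if (!have_route) {
		ret = 0;
		goto out;
	}
	ret = 1;
out:
	pthread_mutex_unlock(&route_lock);
	return ret;
}
```

Funnelling every exit through one label keeps the unlock in a single place, which is why a new early return added later is less likely to reintroduce the leak.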
Re: Linux AIO status todo
On Tue, Aug 23, 2005 at 05:56:09AM -0400, Jakub Jelinek wrote:
> POSIX AIO needs to handle SIGEV_NONE, SIGEV_SIGNAL and SIGEV_THREAD notification. Obviously the kernel shouldn't create threads for SIGEV_THREAD itself, as the kernel shouldn't hardcode all the implementation details of how a thread can be created. But it would be good if AIO signalling e.g. handled both SIGEV_SIGNAL and SIGEV_SIGNAL | SIGEV_THREAD_ID, with the same usage as e.g. the timer_* syscalls. If the kernel makes sure the SI_ASYNCIO si_code is set in the notification signal siginfos, glibc could even use just one helper thread for timer_*/[al]io_* and maybe in the future other SIGEV_THREAD notification.

The signal patch from Sebastien should handle the SIGEV_foo. The patch at http://www.kvack.org/~bcrl/patches/aio-2.6.13-rc6-B1/817_sigevent.diff has the latest changes from me and should do what is needed.

-ben
Re: some missing spin_unlocks
> This one needs more care. We can't drop the lock, because the destroy actions need to be protected by that lock, but we can't release the lock after rose_destroy_socket() because the object may not even exist any longer.

Does it matter? Can ANYTHING be spinning on the lock? If not, can we just let the lock go poof and not unlock it...
Re: 2.6.13-rc6-mm2 - fs/xfs/xfs*.c warnings
I'm compiling 2.6.13-rc6-mm2 atm and noticed that xfs produces lots of warnings while compiling. I recently switched to gcc 4.0.1, so maybe it's because of this.

Details:

fs/xfs/xfs_acl.c: In function 'xfs_acl_access':
fs/xfs/xfs_acl.c:445: warning: 'matched.ae_perm' may be used uninitialized in this function
fs/xfs/xfs_alloc_btree.c: In function 'xfs_alloc_insrec':
fs/xfs/xfs_alloc_btree.c:622: warning: 'nrec.ar_startblock' may be used uninitialized in this function
fs/xfs/xfs_alloc_btree.c:622: warning: 'nrec.ar_blockcount' may be used uninitialized in this function
fs/xfs/xfs_bmap.c: In function 'xfs_bmap_alloc':
fs/xfs/xfs_bmap.c:2335: warning: 'rtx' is used uninitialized in this function
fs/xfs/xfs_dir2_sf.c: In function 'xfs_dir2_block_sfsize':
fs/xfs/xfs_dir2_sf.c:110: warning: 'parent' may be used uninitialized in this function
fs/xfs/xfs_dir_leaf.c: In function 'xfs_dir_leaf_to_shortform':
fs/xfs/xfs_dir_leaf.c:653: warning: 'parent' may be used uninitialized in this function
fs/xfs/xfs_ialloc_btree.c: In function 'xfs_inobt_insrec':
fs/xfs/xfs_ialloc_btree.c:750: warning: 'nrec.ir_free' is used uninitialized in this function
fs/xfs/xfs_ialloc_btree.c:750: warning: 'nrec.ir_freecount' is used uninitialized in this function
fs/xfs/xfs_ialloc_btree.c:567: warning: 'nrec.ir_startino' may be used uninitialized in this function

and the following warning appears many times:

fs/xfs/xfs_bmap_btree.h:508:21: warning: __BIG_ENDIAN is not defined
fs/xfs/xfs_bmap_btree.h:626:21: warning: __BIG_ENDIAN is not defined

Just giving a heads-up in case somebody wants to clean up this code.

thanx + greetings,
Damir

On Tuesday 23 August 2005 06:30, Andrew Morton wrote:
| ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/
|
| - Various updates. Nothing terribly noteworthy.
|
| - This kernel still spits a bunch of scheduling-while-atomic warnings
|   from the scsi code. Please ignore.

--
It is impossible for an optimist to be pleasantly surprised.
Re: 2.6.12 Performance problems
--- Helge Hafting [EMAIL PROTECTED] wrote:

> Danial Thom wrote:
>
>> --- Jesper Juhl [EMAIL PROTECTED] wrote:
>>
>>> On 8/21/05, Danial Thom [EMAIL PROTECTED] wrote:
>>>
>>>> I just started fiddling with 2.6.12, and there seems to be a big drop-off in performance from 2.4.x in terms of networking on a uniprocessor system. Just bridging packets through the machine, 2.6.12 starts dropping packets at ~100Kpps, whereas 2.4.x doesn't start dropping until over 350Kpps on the same hardware (2.0GHz Opteron with e1000 driver). This is pitiful performance for this hardware. I've increased the rx ring in the e1000 driver to 512 with little change (interrupt moderation is set to 8000 ints/second). Has tuning for MP destroyed UP performance altogether, or is there some tuning parameter that could make a 4-fold difference? All debugging is off and there are no messages on the console or in the error logs. The kernel is the standard kernel.org download config with SMP turned off and the Intel ethernet card drivers as modules without any other changes, which is exactly the config for my 2.4 kernels.
>>>
>>> If you have preemption enabled you could disable it. Low latency comes at the cost of decreased throughput - can't have both. Also try using a HZ of 100 if you are currently using 1000, that should also improve throughput a little at the cost of slightly higher latencies. I doubt that it'll make any huge difference, but if it does, then that would probably be valuable info.
>>
>> Ok, well you'll have to explain this one: "Low latency comes at the cost of decreased throughput - can't have both"
>
> Configuring preempt gives lower latency, because then almost anything can be interrupted (preempted). You can then get very quick responses to some things, i.e. interrupts and such.

I think part of the problem is the continued misuse of the word latency. Latency, in language terms, means unexplained delay. It's wrong here because, for one, it's explainable. But it also depends on your perspective. The latency is increased for kernel tasks, while it may be reduced for something that is getting the benefit of preempting the kernel. So you really can't say the price of reduced latency is lower throughput, because that's simply backwards. You've increased the kernel task's latency by allowing it to be pre-empted. Reduced latency implies higher efficiency. All you've done here is shift the latency from one task to another, so there is no reduction overall; in fact there is probably a marginal increase due to the overhead of pre-emption vs doing nothing.

DT
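For readers following the numbers in the original report, a rough back-of-the-envelope helper may be useful. It only divides the quoted packet rates by the quoted interrupt moderation setting; the function name is made up for illustration, and it assumes steady traffic with interrupts arriving at exactly the moderated rate, which the thread does not confirm.

```c
#include <assert.h>

/*
 * Average number of packets that arrive between two interrupts,
 * assuming a steady packet rate and a fixed interrupt rate.
 * (Hypothetical helper, not part of any driver.)
 */
static double packets_per_interrupt(double packets_per_sec, double ints_per_sec)
{
	assert(ints_per_sec > 0.0);
	return packets_per_sec / ints_per_sec;
}
```

With the figures from the report, 100Kpps at 8000 ints/second averages 12.5 packets per interrupt and 350Kpps averages 43.75, both well below a 512-entry rx ring. Under these assumptions that is at least consistent with the observation that enlarging the ring changed little, though it does not identify where the drops come from.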
Re: 2.6.12 Performance problems
Danial Thom wrote:

> I think part of the problem is the continued misuse of the word latency. Latency, in language terms, means unexplained delay. It's wrong here because, for one, it's explainable. But it also depends on your perspective. The latency is increased for kernel tasks, while it may be reduced for something that is getting the benefit of preempting the kernel. So you really can't say the price of reduced latency is lower throughput, because that's simply backwards. You've increased the kernel task's latency by allowing it to be pre-empted. Reduced latency implies higher efficiency. All you've done here is shift the latency from one task to another, so there is no reduction overall; in fact there is probably a marginal increase due to the overhead of pre-emption vs doing nothing.

If, instead of complaining, you would provide the information I asked for two days ago, someone might actually be able to help you.
Re: 2.6.13-rc6-mm2 - drivers/net/s2io.o failed building
2.6.13-rc6-mm2 failed building with this problem (gcc 4.0.1):

  CC [M]  drivers/net/s2io.o
In file included from drivers/net/s2io.c:65:
drivers/net/s2io.h: In function 'readq':
drivers/net/s2io.h:765: error: invalid lvalue in assignment
drivers/net/s2io.h:766: error: invalid lvalue in assignment
make[2]: *** [drivers/net/s2io.o] Error 1
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2
== ERROR: Build Failed. Aborting...

greetings,
Damir

On Tuesday 23 August 2005 06:30, you wrote:
| - Various updates. Nothing terribly noteworthy.
|
| - This kernel still spits a bunch of scheduling-while-atomic warnings
|   from the scsi code. Please ignore.

--
Never give in. Never give in. Never. Never. Never.
-- Winston Churchill
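A note on the error itself: gcc 4.0 removed the old "cast as lvalue" extension, so a statement that assigns through a cast now fails with exactly this "invalid lvalue in assignment" message. The offending lines of s2io.h are not quoted in the report, so the following is only a sketch of the likely pattern and a portable way to write it. `fake_readl` and `readq_sketch` are made-up names standing in for the driver's 32-bit MMIO reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 32-bit register read; a real driver
 * would use readl() on a memory-mapped address instead.
 */
static uint32_t fake_readl(const uint32_t *regs, unsigned int index)
{
	return regs[index];
}

/*
 * Build a 64-bit value from two 32-bit reads.
 *
 * A statement such as
 *     (uint64_t) ret <<= 32;
 * is rejected by gcc 4.x: a cast produces a value, not an object,
 * so it cannot appear on the left side of an assignment.
 * Operating on the variable directly avoids the problem.
 */
static uint64_t readq_sketch(const uint32_t *regs)
{
	uint64_t ret;

	assert(regs != NULL);
	ret = fake_readl(regs, 1);   /* high word */
	ret <<= 32;                  /* plain lvalue, no cast */
	ret |= fake_readl(regs, 0);  /* low word */
	return ret;
}
```

Because `ret` is already 64 bits wide, the shift needs no cast at all; if the actual header differs from this guess, the same principle (never assign through a cast) still applies.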
Re: select() efficiency / epoll
So, I've been trying to use epoll.. on linux-2.6.11-6mdk

However, I'm getting segfaults because some pointers in places are getting set to low integer values (which didn't used to have those values).

The deal is that my application is multi-threaded, and I was wondering if epoll had issues if you use epoll_ctl while an epoll_wait is waiting or something like that. I'm also compiling with -D_MULTI_THREADED. I'm not new to threading, but am stumped at this point. I'm not ruling out it being my code, but wanted to ask about epoll since it's so new.

Any ideas?

Thanks,
Davy

bert hubert wrote:

> On Fri, Jul 22, 2005 at 04:18:46PM -0500, Davy Durham wrote:
>
>> Please forgive and redirect me if this is not the right place to ask this question:
>>
>> I'm looking to write a sort of messaging system that would take input from any number of entities that register with it.. it would then route the messages to outputs and so forth..
>
> Look at epoll, or libevent, which uses epoll to be quick in this scenario.
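Two facts may help narrow this down. The epoll_wait(2) manual page states that another thread may add a file descriptor with epoll_ctl() while a thread is blocked in epoll_wait(), so that combination by itself is allowed. Separately, `epoll_data_t` is a union, so storing both `data.ptr` and `data.fd` in the same event overwrites the pointer with a small integer. Whether that is what happens in Davy's code is pure speculation; the sketch below only shows the safe pattern of keeping the descriptor inside the pointed-to structure. `struct conn`, `watch_conn` and `roundtrip` are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection state; it carries its own descriptor. */
struct conn {
	int fd;
	int id;
};

/*
 * Register c->fd for readability. Only data.ptr is set: epoll_data_t
 * is a union, so also assigning ev.data.fd would overwrite the
 * pointer bytes with the descriptor number.
 */
static int watch_conn(int epfd, struct conn *c)
{
	struct epoll_event ev = { 0 };

	assert(c != NULL);
	ev.events = EPOLLIN;
	ev.data.ptr = c;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/*
 * Round trip through a pipe: returns the pointer that epoll_wait()
 * hands back for the readable end, or NULL on any failure.
 */
static struct conn *roundtrip(struct conn *c)
{
	int pipefd[2];
	struct epoll_event out;
	struct conn *got = NULL;
	int epfd;

	if (pipe(pipefd) != 0)
		return NULL;
	epfd = epoll_create(1);
	if (epfd >= 0) {
		c->fd = pipefd[0];
		if (watch_conn(epfd, c) == 0 &&
		    write(pipefd[1], "x", 1) == 1 &&
		    epoll_wait(epfd, &out, 1, 1000) == 1)
			got = out.data.ptr;
		close(epfd);
	}
	close(pipefd[0]);
	close(pipefd[1]);
	return got;
}
```

Calling `roundtrip(&c)` on a local `struct conn c` should hand back `&c` unchanged; reading `c->fd` from the returned pointer then replaces any need for `data.fd`.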
Console Resolution Support Request - 1024x480
Hey,

After seeing many, many posts and no solutions anywhere for getting a full-screen console on a Sony Vaio Picturebook with an ATI Rage Mobility video chip on a kernel anywhere near the current version ~inhale~ I finally tried to use a kernel-2.4.17 diff patch as a guide and manually change the necessary source code to support 1024x480. If there is any info that you need, I might be able to get it for you, but my experience is limited, unfortunately.

My attempt to fix this manually was a failure. I found all the code and made the corresponding adjustments, but when I made the adjustments to ../drivers/video/aty/mach64_ct.c there were problems with pll and mpostdiv being undefined. I tried to work around this by leaving the code as it was originally, or by replacing the necessary code with something similar (in theory), but without success. It did compile; however, after booting into the kernel with vga=0x301 the screen was unreadable, mis-sized (too large), and flashing.

I wish I could manage this on my own, but I'm not capable and have made no progress. I'm surprised that no one has taken the older patch and merged it into the kernel, so that this would only have been an issue in 2.4.17 and earlier.

I do know that when I use Windows XP on this machine, I need NeoMagic drivers to get full-screen usage (if you need these, contact me for the exact drivers that made mine work), even though my video card is in fact an ATI Rage Mobility chip.

Please, even if you can't help with this, give me some information that may lead to a positive outcome for any/all Picturebook users. Thank you in advance!

Note: I attempted this fix on the linux-2.4.28-r9 kernel. If you would like to see my attempted patch I will show you that as well; it applies cleanly, but the resulting code does not work.
Here is the patch for the 2.4.17 kernel:

diff -Nur linux-2.4.17/drivers/video/Config.in linux/drivers/video/Config.in
--- linux-2.4.17/drivers/video/Config.in	Thu Nov 15 10:16:31 2001
+++ linux/drivers/video/Config.in	Fri Jan 11 16:13:37 2002
@@ -135,6 +135,9 @@
 	if [ $CONFIG_FB_ATY != n ]; then
 	   bool 'Mach64 GX support (EXPERIMENTAL)' CONFIG_FB_ATY_GX
 	   bool 'Mach64 CT/VT/GT/LT (incl. 3D RAGE) support' CONFIG_FB_ATY_CT
+	   if [ $CONFIG_FB_ATY_CT = y ]; then
+	      bool ' Sony Vaio C1VE 1024x480 LCD support' CONFIG_FB_ATY_CT_VAIO_LCD
+	   fi
 	fi
 	tristate ' ATI Radeon display support (EXPERIMENTAL)' CONFIG_FB_RADEON
 	tristate ' ATI Rage128 display support (EXPERIMENTAL)' CONFIG_FB_ATY128
diff -Nur linux-2.4.17/drivers/video/aty/atyfb_base.c linux/drivers/video/aty/atyfb_base.c
--- linux-2.4.17/drivers/video/aty/atyfb_base.c	Fri Dec 21 21:37:11 2001
+++ linux/drivers/video/aty/atyfb_base.c	Sat Dec 22 02:39:12 2001
@@ -353,6 +353,7 @@
 	/* 3D RAGE Mobility */
 	{ 0x4c4d, 0x4c4d, 0x00, 0x00, m64n_mob_p, 230, 50, M64F_GT | M64F_INTEGRATED | M64F_RESET_3D | M64F_GTB_DSP | M64F_MOBIL_BUS },
+{ 0x4c52, 0x4c52, 0x00, 0x00, m64n_mob_p, 230, 40, M64F_GT | M64F_INTEGRATED | M64F_RESET_3D | M64F_GTB_DSP | M64F_MOBIL_BUS | M64F_MAGIC_POSTDIV | M64F_SDRAM_MAGIC_PLL | M64F_XL_DLL },
 	{ 0x4c4e, 0x4c4e, 0x00, 0x00, m64n_mob_a, 230, 50, M64F_GT | M64F_INTEGRATED | M64F_RESET_3D | M64F_GTB_DSP | M64F_MOBIL_BUS },
 #endif /* CONFIG_FB_ATY_CT */
 };
@@ -423,7 +424,7 @@
 #endif /* defined(CONFIG_PPC) */
-#if defined(CONFIG_PMAC_PBOOK) || defined(CONFIG_PMAC_BACKLIGHT)
+#if defined(CONFIG_PMAC_PBOOK) || defined(CONFIG_PMAC_BACKLIGHT) || defined(CONFIG_FB_ATY_CT_VAIO_LCD)
 static void aty_st_lcd(int index, u32 val, const struct fb_info_aty *info)
 {
 	unsigned long temp;
@@ -445,7 +446,7 @@
 	/* read the register value */
 	return aty_ld_le32(LCD_DATA, info);
 }
-#endif /* CONFIG_PMAC_PBOOK || CONFIG_PMAC_BACKLIGHT */
+#endif /* CONFIG_PMAC_PBOOK || CONFIG_PMAC_BACKLIGHT || CONFIG_FB_ATY_CT_VAIO_LCD */
 /* - */
@@ -1744,6 +1745,9 @@
 #if defined(CONFIG_PPC)
 	int sense;
 #endif
+#if defined(CONFIG_FB_ATY_CT_VAIO_LCD)
+u32 pm, hs;
+#endif
 	u8 pll_ref_div;
 	info->aty_cmap_regs = (struct aty_cmap_regs *)(info->ati_regbase+0xc0);
@@ -2068,6 +2072,35 @@
 	var = default_var;
 #endif /* !__sparc__ */
 #endif /* !CONFIG_PPC */
+#if defined(CONFIG_FB_ATY_CT_VAIO_LCD)
+	/* Power Management */
+	pm=aty_ld_lcd(POWER_MANAGEMENT, info);
+	pm=(pm & ~PWR_MGT_MODE_MASK) | PWR_MGT_MODE_PCI;
+	pm|=PWR_MGT_ON;
+
Re: some missing spin_unlocks
From: Arjan van de Ven [EMAIL PROTECTED]
Subject: Re: some missing spin_unlocks
Date: Tue, 23 Aug 2005 19:40:06 +0200

> On Tue, 2005-08-23 at 10:30 -0700, David S. Miller wrote:
>> From: Arjan van de Ven [EMAIL PROTECTED]
>> Date: Tue, 23 Aug 2005 18:54:03 +0200
>>
>>> does it matter? can ANYTHING be spinning on the lock? if not .. can we just let the lock go poof and not unlock it...
>>
>> I believe socket lookup can, otherwise the code is OK as-is.
>
> lookup while the object is in progress of being destroyed sounds really bad though

This happens all the time with TCP sockets, for example. When we're trying to kill off a socket which is in time wait state, the receive path can find it, grab a reference, and process a packet against it right as we're trying to kill it off. This is completely normal.
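The "find it, grab a reference, use it while someone else tears it down" pattern described above can be illustrated with a plain reference count. This is a user-space sketch using C11 atomics, not the kernel's socket code; `struct obj`, `obj_hold` and `obj_put` are invented names, and a flag is set where a real implementation would free the memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object standing in for a time-wait socket. */
struct obj {
	atomic_int refcnt; /* starts at 1: the reference held by the lookup table */
	bool freed;        /* set instead of calling free(), so the result is observable */
};

static void obj_init(struct obj *o)
{
	atomic_init(&o->refcnt, 1);
	o->freed = false;
}

/* Lookup path: take an extra reference before using the object. */
static void obj_hold(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/* Drop one reference; whoever drops the last one releases the object. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1)
		o->freed = true;
}
```

If the receive path calls `obj_hold()` and the teardown path then drops the table's reference with `obj_put()`, the object stays alive until the receive path issues its own `obj_put()`. The lookup and the hold still have to happen atomically with respect to removal from the table, which is presumably where the lock under discussion comes in.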
Re: some missing spin_unlocks
On Tue, 2005-08-23 at 10:30 -0700, David S. Miller wrote:
> From: Arjan van de Ven [EMAIL PROTECTED]
> Date: Tue, 23 Aug 2005 18:54:03 +0200
>
>> does it matter? can ANYTHING be spinning on the lock? if not .. can we just let the lock go poof and not unlock it...
>
> I believe socket lookup can, otherwise the code is OK as-is.

lookup while the object is in progress of being destroyed sounds really bad though
Re: CONFIG_PRINTK_TIME woes
David S. Miller wrote:

> This is a useful feature, please do not lobotomize it just because it's difficult to implement on ia64. Just make a printk_get_timestamp_because_ia64_sucks() interface or something like that :-)

I was a bit unclear when I raised this issue. It is not just an ia64 problem.

The sched_clock() interface is allowed to return wildly different values depending on which CPU it is called from, and currently has fundamental problems at least on i386, where it can go forwards and backwards by arbitrary amounts of time (due to frequency scaling, if I understand correctly), and also needn't be exactly nanoseconds at the best of times.

The interface is like this so it can be per-cpu and lockless and as fast as possible for the scheduler heuristics (which aren't too picky). I just don't want its usage spreading outside kernel/sched.c if we can help it.

Pragmatically it sounds like the best thing we have for printk at this time; however, I hope we can come up with something slightly more appropriate even if it ends up being slower.
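As an illustration of what "slightly more appropriate even if slower" could look like, here is a user-space sketch that clamps raw clock readings so the published timestamp never goes backwards. It is not a proposal from the thread and not kernel code: `monotonic_stamp` and `last_stamp` are invented names, and the raw reading is passed in as a parameter rather than taken from sched_clock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Last timestamp handed out, shared by all callers. */
static _Atomic uint64_t last_stamp;

/*
 * Return a timestamp that never goes backwards, given a raw reading
 * (in nanoseconds) from a clock that may jump between CPUs.
 */
static uint64_t monotonic_stamp(uint64_t raw_ns)
{
	uint64_t prev = atomic_load(&last_stamp);

	while (raw_ns > prev) {
		/* Try to publish the newer value; on failure prev is reloaded. */
		if (atomic_compare_exchange_weak(&last_stamp, &prev, raw_ns))
			return raw_ns;
	}
	return prev; /* raw reading was stale: reuse the latest value seen */
}
```

The cost is the trade-off the message anticipates: every caller touches one shared variable, so this is slower than a purely per-CPU read, and a backwards jump shows up as repeated identical timestamps rather than as a corrected time.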
Re: [patch 2.6.13-rc6] i386: fix incorrect FP signal delivery
On Tue, 23 Aug 2005 02:20:07 +0200, Andi Kleen wrote:

> every reviewer has to look up all the bits in the manual?

I fixed the test program too:

Before patch:

$ ./fpsig
handler: signum = 8, errno = 0, code = 0 [unknown]
handler: fpu cwd = 0xb40, fpu swd = 0xbaa0
handler: i387 unmasked precision exception, rounded up

After:

$ ./fpsig
handler: signum = 8, errno = 0, code = 6 [inexact result]
handler: fpu cwd = 0xb40, fpu swd = 0xbaa0
handler: i387 unmasked precision exception, rounded up

/* i387 fp signal test */

#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <signal.h>
#include <errno.h>

__attribute__ ((aligned(4096))) unsigned char altstack[4096];
unsigned short cw = 0x0b40; /* unmask all exceptions, round up */
struct sigaction sa;
stack_t ss = {
	.ss_sp   = &altstack[2047],
	.ss_size = sizeof(altstack)/2,
};

static void handler(int nr, siginfo_t *si, void *uc)
{
	char *decode;
	int code = si->si_code;
	unsigned short cwd = *(unsigned short *)&altstack[0xd84];
	unsigned short swd = *(unsigned short *)&altstack[0xd88];

	switch (code) {
	case FPE_INTDIV:
		decode = "divide by zero";
		break;
	case FPE_FLTRES:
		decode = "inexact result";
		break;
	case FPE_FLTINV:
		decode = "invalid operation";
		break;
	default:
		decode = "unknown";
		break;
	}
	printf("handler: signum = %d, errno = %d, code = %d [%s]\n",
		si->si_signo, si->si_errno, code, decode);
	printf("handler: fpu cwd = 0x%hx, fpu swd = 0x%hx\n", cwd, swd);
	if (swd & 0x20 & ~cwd)
		printf("handler: i387 unmasked precision exception, rounded %s\n",
			swd & 0x200 ? "up" : "down");
	exit(1);
}

int main(int argc, char * const argv[])
{
	sa.sa_sigaction = handler;
	sa.sa_flags = SA_ONSTACK | SA_SIGINFO;

	if (sigaltstack(&ss, 0))
		perror("sigaltstack");
	if (sigaction(SIGFPE, &sa, NULL))
		perror("sigaction");

	asm volatile ("fnclex ; fldcw %0" : : "m" (cw));
	asm volatile ( /* st(1) = 3.0, st = 1.0 */
		"fld1 ; fld1 ; faddp ; fld1 ; faddp ; fld1");
	asm volatile (
		"fdivp ; fwait"); /* 1.0 / 3.0 */
	return 0;
}

__
Chuck
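To make the "before" and "after" output easier to read, here is a small user-space sketch of how a signal code can be derived from the x87 control and status words printed by the handler. It is not the kernel patch under discussion (which is not quoted in this message); `x87_si_code` is an invented helper, the check order is just one plausible choice, and only the six exception bits (mask 0x3f) are considered so that condition-code bits such as C1 (0x200) cannot disturb the result.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>

/*
 * Map x87 control word (cwd) and status word (swd) to an FPE_* code.
 * An exception is pending and unmasked when its flag is set in swd
 * and its mask bit is clear in cwd. Returns 0 if none is.
 */
static int x87_si_code(unsigned short cwd, unsigned short swd)
{
	unsigned int pending = swd & ~cwd & 0x3f;

	if (pending & 0x01)
		return FPE_FLTINV; /* invalid operation */
	if (pending & 0x04)
		return FPE_FLTDIV; /* divide by zero */
	if (pending & 0x08)
		return FPE_FLTOVF; /* overflow */
	if (pending & 0x12)
		return FPE_FLTUND; /* denormal operand or underflow */
	if (pending & 0x20)
		return FPE_FLTRES; /* inexact result (precision) */
	return 0;
}
```

With the values from the output above, cwd = 0xb40 and swd = 0xbaa0, only the precision bit 0x20 survives the mask, which yields FPE_FLTRES and matches the "code = 6 [inexact result]" line reported after the patch on Linux.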
Re: 2.6.12 Performance problems
On Tue, 2005-08-23 at 10:10 -0700, Danial Thom wrote:

>>> Ok, well you'll have to explain this one: "Low latency comes at the cost of decreased throughput - can't have both"
>>
>> Configuring preempt gives lower latency, because then almost anything can be interrupted (preempted). You can then get very quick responses to some things, i.e. interrupts and such.
>
> I think part of the problem is the continued misuse of the word latency. Latency, in language terms, means unexplained delay.

latency
     n 1: (computer science) the time it takes for a specific block of data on a data track to rotate around to the read/write head [syn: rotational latency]
     2: the time that elapses between a stimulus and the response to it [syn: reaction time, response time, latent period]
     3: the state of being not yet evident or active

No apparent references to "unexplained" in association with the word latency.