Re: 2.6.23-rc1-mm2: MMC_ARMMMCI compile error
On Wed, 8 Aug 2007 23:31:14 +0200 Adrian Bunk <[EMAIL PROTECTED]> wrote:

> CONFIG_MMC_ARMMMCI=m/y results in the following compile error:
>
> <-- snip -->
>
> ...
>   CC [M]  drivers/mmc/host/mmci.o
> /home/bunk/linux/kernel-2.6/linux-2.6.23-rc1-mm2/drivers/mmc/host/mmci.c: In function 'mmci_request':
> /home/bunk/linux/kernel-2.6/linux-2.6.23-rc1-mm2/drivers/mmc/host/mmci.c:398: error: implicit declaration of function 'mmc_end_request'
> make[4]: *** [drivers/mmc/host/mmci.o] Error 1

Thanks. That wasn't the only bug in there. Hopefully fixed now.

Rgds
--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
2.6.23-rc1-mm2: MMC_ARMMMCI compile error
On Tue, Jul 31, 2007 at 11:09:32PM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc1-mm1:
>...
>  git-mmc.patch
>...
>  git trees
>...

CONFIG_MMC_ARMMMCI=m/y results in the following compile error:

<-- snip -->

...
  CC [M]  drivers/mmc/host/mmci.o
/home/bunk/linux/kernel-2.6/linux-2.6.23-rc1-mm2/drivers/mmc/host/mmci.c: In function 'mmci_request':
/home/bunk/linux/kernel-2.6/linux-2.6.23-rc1-mm2/drivers/mmc/host/mmci.c:398: error: implicit declaration of function 'mmc_end_request'
make[4]: *** [drivers/mmc/host/mmci.o] Error 1

<-- snip -->

cu
Adrian

--

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
Re: [NFS] 2.6.23-rc1-mm2
On 08/07, Trond Myklebust wrote:
>
> On Wed, 2007-08-08 at 02:20 +0400, Oleg Nesterov wrote:
>
> > But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
> > clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled
> > at the time when nfs4_kill_renewd(), we can deadlock, no? Because
> > nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
> > holds this sem, and waits for nfs4_renew_state() completion.
>
> They both take read locks,

Aah. Please ignore me, thanks!

>                            which means that they can take them
> simultaneously. AFAICS, the deadlock can only occur if something manages
> to insert a request for a write lock after nfs4_kill_renewd() takes its
> read lock, but before nfs4_renew_state() takes its read lock:
>
> 1) nfs4_kill_renewd()      2) nfs4_renew_state()       3) somebody else
> ---------------------      ---------------------       ----------------
> read lock
> wait on (2) to complete
>                                                        write lock
>                                                        <waits on (1)>
>                            read lock
>                            <waits on (3), because
>                             rw_semaphores don't allow
>                             a read lock request to
>                             jump a write lock request>
>
> however as I explained earlier, the only process that can take a write
> lock is the reclaimer daemon, but we _know_ that cannot be running (for
> one thing, the reference count on nfs_client is zero, for the other,
> there are no superblocks).

Oleg.
Re: [NFS] 2.6.23-rc1-mm2
On Wed, 2007-08-08 at 02:20 +0400, Oleg Nesterov wrote:

> But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
> clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled
> at the time when nfs4_kill_renewd(), we can deadlock, no? Because
> nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
> holds this sem, and waits for nfs4_renew_state() completion.

They both take read locks, which means that they can take them
simultaneously. AFAICS, the deadlock can only occur if something manages
to insert a request for a write lock after nfs4_kill_renewd() takes its
read lock, but before nfs4_renew_state() takes its read lock:

1) nfs4_kill_renewd()      2) nfs4_renew_state()       3) somebody else
---------------------      ---------------------       ----------------
read lock
wait on (2) to complete
                                                       write lock
                                                       <waits on (1)>
                           read lock
                           <waits on (3), because
                            rw_semaphores don't allow
                            a read lock request to
                            jump a write lock request>

however as I explained earlier, the only process that can take a write
lock is the reclaimer daemon, but we _know_ that cannot be running (for
one thing, the reference count on nfs_client is zero, for the other,
there are no superblocks).

Cheers
  Trond
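[Editorial sketch, not part of the original message.] The three-column interleaving Trond describes can be replayed with a small userspace model. This is only a toy under stated assumptions: `toy_rwsem` and every `toy_*` function are hypothetical names invented for illustration, and the model captures just one property of the kernel's rw_semaphore, namely that a new reader may not jump ahead of a queued writer.

```c
#include <stdbool.h>

/* Toy model of a fair reader/writer semaphore; NOT the kernel's rwsem. */
struct toy_rwsem {
	int  active_readers;  /* readers currently holding the lock */
	bool writer_active;   /* a writer currently holds the lock */
	bool writer_waiting;  /* a writer is queued behind the readers */
};

/* A new reader gets in only if no writer holds or is queued for the lock. */
static bool toy_try_read(struct toy_rwsem *s)
{
	if (s->writer_active || s->writer_waiting)
		return false;
	s->active_readers++;
	return true;
}

/* A writer needs exclusive access; if it cannot get it, it queues. */
static bool toy_try_write(struct toy_rwsem *s)
{
	if (s->active_readers > 0 || s->writer_active) {
		s->writer_waiting = true;
		return false;
	}
	s->writer_active = true;
	return true;
}

/*
 * Replays the interleaving from the table. Returns true if the work item
 * (standing in for nfs4_renew_state()) cannot take its read lock while the
 * canceller (standing in for nfs4_kill_renewd()) holds its own read lock
 * and waits for the work item to finish, i.e. if the two are stuck.
 */
static bool toy_scenario_deadlocks(bool writer_arrives)
{
	struct toy_rwsem sem = {0};

	toy_try_read(&sem);           /* (1) read lock, then waits on (2) */
	if (writer_arrives)
		toy_try_write(&sem);  /* (3) write lock request queues behind (1) */

	/* (2) needs the read lock to finish; (1) holds its lock until then */
	return !toy_try_read(&sem);
}
```

Calling `toy_scenario_deadlocks(false)` models the case Trond argues is the only one possible (no writer, so both readers proceed), while `toy_scenario_deadlocks(true)` models the hypothetical writer in column 3.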
Re: [NFS] 2.6.23-rc1-mm2
On 08/07, Trond Myklebust wrote:
>
> On Wed, 2007-08-08 at 01:37 +0400, Oleg Nesterov wrote:
> > On 08/07, Trond Myklebust wrote:
> > >
> > > On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > > > On 08/03, Trond Myklebust wrote:
> > > > > I'll have a look at this. I suspect that most if not all of our
> > > > > calls to run_workqueue()/flush_scheduled_work() can now be
> > > > > replaced by more targeted calls to cancel_work_sync() and
> > > > > cancel_delayed_work_sync().
> > > >
> > > > Yes, please, if possible.
> > >
> > > All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
> > > for the flush_workqueue()/flush_scheduled_work() calls was to ensure
> > > that the cancel_work()/cancel_delayed_work() calls preceding them have
> > > completed. Nevertheless I've split the conversion into two patches,
> > > since one touches only the NFS code, whereas the other touches the
> > > SUNRPC client and server code.
> > >
> > > The two patches have been tested, and appear to work...
> >
> > Great!
> >
> > >  void
> > >  nfs4_kill_renewd(struct nfs_client *clp)
> > >  {
> > >         down_read(&clp->cl_sem);
> > > -       cancel_delayed_work(&clp->cl_renewd);
> > > +       cancel_delayed_work_sync(&clp->cl_renewd);
> > >         up_read(&clp->cl_sem);
> > > -       flush_scheduled_work();
> > >  }
> >
> > this looks unsafe to me, the window is very small, but afaics this can
> > deadlock if called when nfs4_renew_state() has already started, but didn't
> > take ->cl_sem yet.
>
> Not really. We have removed the nfs_client from the public lists, and we
> are guaranteed that there are no more active superblocks attached to it
> so nothing can call the reclaimer routine (which is the only routine
> that takes a write lock on clp->cl_sem).

Thanks for your explanation. Not that I was able to understand, nfs is a
black magic to me :)

But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled
at the time when nfs4_kill_renewd(), we can deadlock, no? Because
nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
holds this sem, and waits for nfs4_renew_state() completion.

> > Btw, unless I missed something, the code without this patch looks incorrect
> > too: cancel_delayed_work() can fail if the timer expired, but the ->cl_renewd
> > didn't run yet. In that case nfs4_renew_state() can run and re-schedule itself
> > after flush_scheduled_work() returns.
>
> No, that should not be possible. Again, see above: there are no active
> superblocks, so clp->cl_superblocks is empty.

Yes, thanks. I missed "goto out" in nfs4_renew_state().

Oleg.
Re: [NFS] 2.6.23-rc1-mm2
On Wed, 2007-08-08 at 01:37 +0400, Oleg Nesterov wrote:
> On 08/07, Trond Myklebust wrote:
> >
> > On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > > On 08/03, Trond Myklebust wrote:
> > > > I'll have a look at this. I suspect that most if not all of our calls to
> > > > run_workqueue()/flush_scheduled_work() can now be replaced by more
> > > > targeted calls to cancel_work_sync() and cancel_delayed_work_sync().
> > >
> > > Yes, please, if possible.
> >
> > All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
> > for the flush_workqueue()/flush_scheduled_work() calls was to ensure
> > that the cancel_work()/cancel_delayed_work() calls preceding them have
> > completed. Nevertheless I've split the conversion into two patches,
> > since one touches only the NFS code, whereas the other touches the
> > SUNRPC client and server code.
> >
> > The two patches have been tested, and appear to work...
>
> Great!
>
> >  void
> >  nfs4_kill_renewd(struct nfs_client *clp)
> >  {
> >         down_read(&clp->cl_sem);
> > -       cancel_delayed_work(&clp->cl_renewd);
> > +       cancel_delayed_work_sync(&clp->cl_renewd);
> >         up_read(&clp->cl_sem);
> > -       flush_scheduled_work();
> >  }
>
> this looks unsafe to me, the window is very small, but afaics this can
> deadlock if called when nfs4_renew_state() has already started, but didn't
> take ->cl_sem yet.

Not really. We have removed the nfs_client from the public lists, and we
are guaranteed that there are no more active superblocks attached to it
so nothing can call the reclaimer routine (which is the only routine
that takes a write lock on clp->cl_sem).

> Can't we avoid taking clp->cl_sem here?

Yes, I believe that we can, for the same reasons as above: the race with
the reclaimer is impossible, hence the read lock on cl_sem is redundant.

> Btw, unless I missed something, the code without this patch looks incorrect
> too: cancel_delayed_work() can fail if the timer expired, but the ->cl_renewd
> didn't run yet. In that case nfs4_renew_state() can run and re-schedule itself
> after flush_scheduled_work() returns.

No, that should not be possible. Again, see above: there are no active
superblocks, so clp->cl_superblocks is empty.

Cheers
  Trond
Re: [NFS] 2.6.23-rc1-mm2
On 08/07, Trond Myklebust wrote:
>
> On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > On 08/03, Trond Myklebust wrote:
> > > I'll have a look at this. I suspect that most if not all of our calls to
> > > run_workqueue()/flush_scheduled_work() can now be replaced by more
> > > targeted calls to cancel_work_sync() and cancel_delayed_work_sync().
> >
> > Yes, please, if possible.
>
> All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
> for the flush_workqueue()/flush_scheduled_work() calls was to ensure
> that the cancel_work()/cancel_delayed_work() calls preceding them have
> completed. Nevertheless I've split the conversion into two patches,
> since one touches only the NFS code, whereas the other touches the
> SUNRPC client and server code.
>
> The two patches have been tested, and appear to work...

Great!

>  void
>  nfs4_kill_renewd(struct nfs_client *clp)
>  {
>         down_read(&clp->cl_sem);
> -       cancel_delayed_work(&clp->cl_renewd);
> +       cancel_delayed_work_sync(&clp->cl_renewd);
>         up_read(&clp->cl_sem);
> -       flush_scheduled_work();
>  }

this looks unsafe to me, the window is very small, but afaics this can
deadlock if called when nfs4_renew_state() has already started, but didn't
take ->cl_sem yet.

Can't we avoid taking clp->cl_sem here?

Btw, unless I missed something, the code without this patch looks incorrect
too: cancel_delayed_work() can fail if the timer expired, but the ->cl_renewd
didn't run yet. In that case nfs4_renew_state() can run and re-schedule itself
after flush_scheduled_work() returns.

Oleg.
Re: [NFS] 2.6.23-rc1-mm2
On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> On 08/03, Trond Myklebust wrote:
> > I'll have a look at this. I suspect that most if not all of our calls to
> > run_workqueue()/flush_scheduled_work() can now be replaced by more
> > targeted calls to cancel_work_sync() and cancel_delayed_work_sync().
>
> Yes, please, if possible.

All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
for the flush_workqueue()/flush_scheduled_work() calls was to ensure
that the cancel_work()/cancel_delayed_work() calls preceding them have
completed. Nevertheless I've split the conversion into two patches,
since one touches only the NFS code, whereas the other touches the
SUNRPC client and server code.

The two patches have been tested, and appear to work...

Trond

--- Begin Message ---
This will avoid deadlocks of the form:

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ee42>] __lock_acquire+0xc22/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c012edd9>] flush_workqueue+0x49/0x70
 [<c012ee0d>] flush_scheduled_work+0xd/0x10
 [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
 [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
 [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
 [<c017b38d>] deactivate_super+0x7d/0xa0
 [<c018f94b>] mntput_no_expire+0x4b/0x80
 [<c018fd94>] expire_mount_list+0xe4/0x140
 [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
 [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
 [<c012e61b>] run_workqueue+0x12b/0x1e0
 [<c012f05b>] worker_thread+0x9b/0x100
 [<c0131c72>] kthread+0x42/0x70
 [<c0104c0f>] kernel_thread_helper+0x7/0x18
 ===

Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 fs/nfs/namespace.c  |    6 ++----
 fs/nfs/nfs4renewd.c |    5 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index 7f86e65..aea76d0 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -175,10 +175,8 @@ static void nfs_expire_automounts(struct work_struct *work)
 
 void nfs_release_automount_timer(void)
 {
-	if (list_empty(&nfs_automount_list)) {
-		cancel_delayed_work(&nfs_automount_task);
-		flush_scheduled_work();
-	}
+	if (list_empty(&nfs_automount_list))
+		cancel_delayed_work_sync(&nfs_automount_task);
 }
 
 /*
diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c
index 0505ca1..3ea352d 100644
--- a/fs/nfs/nfs4renewd.c
+++ b/fs/nfs/nfs4renewd.c
@@ -127,16 +127,15 @@ nfs4_schedule_state_renewal(struct nfs_client *clp)
 void
 nfs4_renewd_prepare_shutdown(struct nfs_server *server)
 {
-	flush_scheduled_work();
+	cancel_delayed_work(&server->nfs_client->cl_renewd);
 }
 
 void
 nfs4_kill_renewd(struct nfs_client *clp)
 {
 	down_read(&clp->cl_sem);
-	cancel_delayed_work(&clp->cl_renewd);
+	cancel_delayed_work_sync(&clp->cl_renewd);
 	up_read(&clp->cl_sem);
-	flush_scheduled_work();
 }
 
 /*
--- End Message ---
--- Begin Message ---
Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 net/sunrpc/cache.c    |    3 +--
 net/sunrpc/rpc_pipe.c |    3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 01c3c41..ebe344f 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -371,8 +371,7 @@ int cache_unregister(struct cache_detail *cd)
 	}
 	if (list_empty(&cache_list)) {
 		/* module must be being unloaded so its safe to kill the worker */
-		cancel_delayed_work(&cache_cleaner);
-		flush_scheduled_work();
+		cancel_delayed_work_sync(&cache_cleaner);
 	}
 	return 0;
 }
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 650af06..669e12a 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -132,8 +132,7 @@ rpc_close_pipes(struct inode *inode)
 		rpci->nwriters = 0;
 		if (ops->release_pipe)
 			ops->release_pipe(inode);
-		cancel_delayed_work(&rpci->queue_timeout);
-		flush_workqueue(rpciod_workqueue);
+		cancel_delayed_work_sync(&rpci->queue_timeout);
 	}
 	rpc_inode_setowner(inode, NULL);
 	mutex_unlock(&inode->i_mutex);
--- End Message ---
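[Editorial sketch, not part of the original message.] The difference between the two cancel calls that these patches rely on can be illustrated with a tiny state model. It assumes only what the thread itself states: cancel_delayed_work() succeeds only while the delay timer is still pending, whereas the _sync variant leaves the work neither pending nor running when it returns. All `toy_*` names are hypothetical and exist only for this illustration; nothing here is kernel code.

```c
#include <stdbool.h>

/* Simplified life cycle of a delayed work item in this toy model. */
enum toy_work_state {
	TOY_IDLE,          /* nothing scheduled */
	TOY_TIMER_PENDING, /* delay timer armed, handler not yet queued */
	TOY_QUEUED,        /* timer already fired, handler waiting to run */
};

/* Models the non-sync cancel: it can only remove a still-pending timer. */
static bool toy_cancel_delayed_work(enum toy_work_state *st)
{
	if (*st == TOY_TIMER_PENDING) {
		*st = TOY_IDLE;
		return true;
	}
	return false; /* timer already expired: the handler will still run */
}

/*
 * Models the _sync cancel: whatever the starting state, the work is idle
 * on return. The real call may have to wait for a running handler; the
 * model skips the waiting and only records the end state.
 */
static void toy_cancel_delayed_work_sync(enum toy_work_state *st)
{
	*st = TOY_IDLE;
}

/* Returns true if the handler can still run after the chosen cancel call. */
static bool toy_handler_may_still_run(enum toy_work_state start, bool use_sync)
{
	enum toy_work_state st = start;

	if (use_sync)
		toy_cancel_delayed_work_sync(&st);
	else
		(void)toy_cancel_delayed_work(&st);

	return st != TOY_IDLE;
}
```

The `TOY_QUEUED` starting state is the window Oleg raises later in the thread: the timer has expired but the handler has not run yet, so the plain cancel reports failure and the handler is still going to execute.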
Re: [NFS] 2.6.23-rc1-mm2
On Monday 06 August 2007 18:24, Trond Myklebust wrote:
> On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:
> > Hi,
> >
> > > (...)
> >
> > just booting into X is enough.
> >
> > I applied the patch, but now I get:
> >
> > =============
> > [ INFO: inconsistent lock state ]
> > 2.6.23-rc1-mm2 #4
> > -
> > inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> > swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
> >  (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
> > {softirq-on-W} state was registered at:
> >   [<c013e870>] __lock_acquire+0x650/0x1030
> >   [<c013f2b1>] lock_acquire+0x61/0x80
> >   [<c02db9ac>] _spin_lock+0x2c/0x40
> >   [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
> >   [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
> >   [<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc]
> >   [<dced3fd4>] a0 [sunrpc]
> >   [<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc]
> >   [<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc]
> >   [<dced65b3>] svc_register+0x93/0x160 [sunrpc]
> >   [<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc]
> >   [<dced7053>] svc_create+0x13/0x20 [sunrpc]
> >   [<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs]
> >   [<dcf48f36>] nfs_get_client+0x176/0x390 [nfs]
> >   [<dcf49181>] nfs4_set_client+0x31/0x190 [nfs]
> >   [<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs]
> >   [<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs]
> >   [<c017b444>] vfs_kern_mount+0x94/0x110
> >   [<c0190a62>] do_mount+0x1f2/0x7d0
> >   [<c01910a6>] sys_mount+0x66/0xa0
> >   [<c0104046>] syscall_call+0x7/0xb
> >   [] 0x
> > irq event stamp: 5277830
> > hardirqs last  enabled at (5277830): [<c017530a>] kmem_cache_free+0x8a/0xc0
> > hardirqs last disabled at (5277829): [<c01752d2>] kmem_cache_free+0x52/0xc0
> > softirqs last  enabled at (5277798): [<c0124173>] __do_softirq+0xa3/0xc0
> > softirqs last disabled at (5277817): [<c01241d7>] do_softirq+0x47/0x50
> >
> > other info that might help us debug this:
> > no locks held by swapper/0.
> >
> > stack backtrace:
> >  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
> >  [<c0105c02>] show_trace+0x12/0x20
> >  [<c0105d15>] dump_stack+0x15/0x20
> >  [<c013ccc3>] print_usage_bug+0x153/0x160
> >  [<c013d8b9>] mark_lock+0x449/0x620
> >  [<c013e824>] __lock_acquire+0x604/0x1030
> >  [<c013f2b1>] lock_acquire+0x61/0x80
> >  [<c02db9ac>] _spin_lock+0x2c/0x40
> >  [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
> >  [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
> >  [<dcf6bf83>] nfs_free_delegation_callback+0x13/0x20 [nfs]
> >  [<c012f9ea>] __rcu_process_callbacks+0x6a/0x1c0
> >  [<c012fb52>] rcu_process_callbacks+0x12/0x30
> >  [<c0124218>] tasklet_action+0x38/0x80
> >  [<c0124125>] __do_softirq+0x55/0xc0
> >  [<c01241d7>] do_softirq+0x47/0x50
> >  [<c0124605>] irq_exit+0x35/0x40
> >  [<c0112463>] smp_apic_timer_interrupt+0x43/0x80
> >  [<c0104a77>] apic_timer_interrupt+0x33/0x38
> >  [<c02690df>] cpuidle_idle_call+0x6f/0x90
> >  [<c01023c3>] cpu_idle+0x43/0x70
> >  [<c02d8c27>] rest_init+0x47/0x50
> >  [<c03bcb6a>] start_kernel+0x22a/0x2b0
> >  [<>] 0x0
> >  ===
>
> That is a different matter. I assume this patch should suffice to fix
> the above problem.
>
> Trond

yes - it does. thanks.

Marc

--
"Our cause has a sacred nature."

Lord Arthur Ponsonby, "Falsehood in Wartime: Propaganda Lies of the First
World War", 1928
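[Editorial sketch, not part of the original message.] The "inconsistent {softirq-on-W} -> {in-softirq-W}" report above says that the same lock was once taken in process context with softirqs enabled (the mount path) and later taken from softirq context (the RCU callback). A toy usage tracker makes the rule concrete. The bit names and `toy_*` functions are hypothetical, and this is not how the lock validator stores its state. The patch Trond refers to is not shown in this thread, so the "process side disables softirqs" branch below is just one generic way to keep the history consistent, not a description of the actual fix.

```c
#include <stdbool.h>

/* Usage bits recorded per lock in this toy model. */
enum {
	TOY_HELD_WITH_SOFTIRQS_ON = 1 << 0, /* process context, softirqs enabled */
	TOY_TAKEN_IN_SOFTIRQ      = 1 << 1, /* acquired from softirq context */
};

/*
 * Records one acquisition and returns true while the usage history stays
 * consistent. Having both bits set is the combination the report flags:
 * a softirq could interrupt the process-context holder on the same CPU
 * and spin on the lock forever.
 */
static bool toy_record_acquire(unsigned int *usage, bool in_softirq,
			       bool softirqs_enabled)
{
	const unsigned int both =
		TOY_HELD_WITH_SOFTIRQS_ON | TOY_TAKEN_IN_SOFTIRQ;

	if (in_softirq)
		*usage |= TOY_TAKEN_IN_SOFTIRQ;
	else if (softirqs_enabled)
		*usage |= TOY_HELD_WITH_SOFTIRQS_ON;

	return (*usage & both) != both;
}

/* Replays the two acquisitions from the report; true means consistent. */
static bool toy_replay_report(bool process_side_disables_softirqs)
{
	unsigned int usage = 0;

	/* mount path: put_rpccred() called from process context */
	toy_record_acquire(&usage, false, !process_side_disables_softirqs);

	/* RCU callback: put_rpccred() called from softirq context */
	return toy_record_acquire(&usage, true, false);
}
```

`toy_replay_report(false)` reproduces the reported pattern, while `toy_replay_report(true)` shows that the history stays consistent if the process-context acquisition never happens with softirqs enabled.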
Re: [NFS] 2.6.23-rc1-mm2
On Wed, 2007-08-08 at 02:20 +0400, Oleg Nesterov wrote:
> But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
> clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled at
> the time when nfs4_kill_renewd(), we can deadlock, no? Because
> nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
> holds this sem, and waits for nfs4_renew_state() completion.

They both take read locks, which means that they can take them
simultaneously. AFAICS, the deadlock can only occur if something manages
to insert a request for a write lock after nfs4_kill_renewd() takes its
read lock, but before nfs4_renew_state() takes its read lock:

1) nfs4_kill_renewd()      2) nfs4_renew_state()      3) somebody else
---------------------      ---------------------      ----------------
read lock
wait on (2) to complete
                                                      write lock
                                                      waits on (1)
                           read lock
                           waits on (3), because
                           rw_semaphores don't allow
                           a read lock request to
                           jump a write lock request

however as I explained earlier, the only process that can take a write
lock is the reclaimer daemon, but we _know_ that cannot be running (for
one thing, the reference count on nfs_client is zero, for the other,
there are no superblocks).

Cheers
  Trond
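The three-party scenario described above can be sketched as a toy model in plain C. This is a single-threaded simulation under the stated assumption that a queued writer blocks later readers; it is not the kernel's rw_semaphore implementation, and `toy_rwsem`, `toy_down_read`, `toy_down_write` and `second_reader_granted` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state of a fair reader/writer semaphore (hypothetical, not kernel code). */
struct toy_rwsem {
	int  active_readers;   /* readers currently holding the lock */
	bool writer_active;    /* true while a writer holds the lock */
	int  waiting_writers;  /* writers queued behind the current holders */
};

/* Try to take a read lock; returns false if the caller would have to sleep. */
bool toy_down_read(struct toy_rwsem *sem)
{
	/* Fairness rule: a new reader may not jump ahead of a queued writer. */
	if (sem->writer_active || sem->waiting_writers > 0)
		return false;
	sem->active_readers++;
	return true;
}

/* Try to take a write lock; on failure the caller is recorded as queued. */
bool toy_down_write(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->active_readers > 0) {
		sem->waiting_writers++;
		return false;
	}
	sem->writer_active = true;
	return true;
}

/*
 * Replays the scenario: party (1) takes a read lock, party (3) may ask for
 * a write lock in between, then party (2) asks for a read lock.
 * Returns whether party (2) gets its read lock without sleeping.
 */
bool second_reader_granted(bool writer_arrives_in_between)
{
	struct toy_rwsem sem = { 0, false, 0 };
	bool first = toy_down_read(&sem);	/* (1) nfs4_kill_renewd() */

	assert(first);
	if (writer_arrives_in_between)
		(void)toy_down_write(&sem);	/* (3) queues behind (1) */
	return toy_down_read(&sem);		/* (2) nfs4_renew_state() */
}
```

With no writer in between, the second reader is granted at once, which matches "they can take them simultaneously". Only when a writer queues between the two read requests does the second reader block, and that is the one case the message rules out for clp->cl_sem.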
Re: [NFS] 2.6.23-rc1-mm2
On 08/07, Trond Myklebust wrote:
>
> On Wed, 2007-08-08 at 02:20 +0400, Oleg Nesterov wrote:
> > But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
> > clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled at
> > the time when nfs4_kill_renewd(), we can deadlock, no? Because
> > nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
> > holds this sem, and waits for nfs4_renew_state() completion.
>
> They both take read locks,

Aah. Please ignore me, thanks!

> which means that they can take them
> simultaneously. AFAICS, the deadlock can only occur if something manages
> to insert a request for a write lock after nfs4_kill_renewd() takes its
> read lock, but before nfs4_renew_state() takes its read lock:
>
> 1) nfs4_kill_renewd()      2) nfs4_renew_state()      3) somebody else
> ---------------------      ---------------------      ----------------
> read lock
> wait on (2) to complete
>                                                       write lock
>                                                       waits on (1)
>                            read lock
>                            waits on (3), because
>                            rw_semaphores don't allow
>                            a read lock request to
>                            jump a write lock request
>
> however as I explained earlier, the only process that can take a write
> lock is the reclaimer daemon, but we _know_ that cannot be running (for
> one thing, the reference count on nfs_client is zero, for the other,
> there are no superblocks).

Oleg.
Re: 2.6.23-rc1-mm2
>> It seems like things go wrong when lparmap.s is generated with
>> (DWARF) debug info; could you try building it (manually) with -g0
>> added on the end of the compile line, and see if head_64.o compiles
>> okay for you then? If so, I'll prepare a proper patch for it, I
>> have a similar one (also for lparmap!) in my queue already...
>
> Ok it worked. I had to add -g0 to Makefile under arch/powerpc/kernel
> because -g0 was added before -g and didn't have any effect when adding
> to Makefile in top dir.

Yeah, that's why I said "build lparmap.s manually" :-)

> But yes - it compiles now.

Great, I'll combine it with my other lparmap build patch then. Thanks
for the report and testing!

Segher
Re: 2.6.23-rc1-mm2
> >>> Second issue as reported earlier allmodconfig fails to build on imac
> >>> g3.
> >>>
> >>>   CC      arch/powerpc/kernel/lparmap.s
> >>>   AS      arch/powerpc/kernel/head_64.o
> >>> lparmap.c: Assembler messages:
> >>> lparmap.c:84: Error: file number 1 already allocated
> >>> make[1]: *** [arch/powerpc/kernel/head_64.o] Blad 1
> >>> make: *** [arch/powerpc/kernel] Blad 2
> >>
> >> Please send me the full output of:
> >>
> >>   gcc --version    (or whatever your gcc is called)
> >>   ld --version
> >>   ld --help        (I know no better way to get the supported binutils
> >>                     targets, and the default target)
> >>
> >> and the lparmap.s file. You might want to skip sending it
> >> to the lists, it will be a bit big (and off-topic on most
> >> of those lists, anyway).
> >
> > Well ... it's 66kB. Not that bad. Please find it attached.
> > Needed gcc and ld info below.
>
> Thanks.
>
> It seems like things go wrong when lparmap.s is generated with
> (DWARF) debug info; could you try building it (manually) with -g0
> added on the end of the compile line, and see if head_64.o compiles
> okay for you then? If so, I'll prepare a proper patch for it, I
> have a similar one (also for lparmap!) in my queue already...

Ok it worked. I had to add -g0 to Makefile under arch/powerpc/kernel
because -g0 was added before -g and didn't have any effect when adding
to Makefile in top dir.

But yes - it compiles now.

Thanks,

	Mariusz
Re: 2.6.23-rc1-mm2
>>> Second issue as reported earlier allmodconfig fails to build on imac
>>> g3.
>>>
>>>   CC      arch/powerpc/kernel/lparmap.s
>>>   AS      arch/powerpc/kernel/head_64.o
>>> lparmap.c: Assembler messages:
>>> lparmap.c:84: Error: file number 1 already allocated
>>> make[1]: *** [arch/powerpc/kernel/head_64.o] Blad 1
>>> make: *** [arch/powerpc/kernel] Blad 2
>>
>> Please send me the full output of:
>>
>>   gcc --version    (or whatever your gcc is called)
>>   ld --version
>>   ld --help        (I know no better way to get the supported binutils
>>                     targets, and the default target)
>>
>> and the lparmap.s file. You might want to skip sending it
>> to the lists, it will be a bit big (and off-topic on most
>> of those lists, anyway).
>
> Well ... it's 66kB. Not that bad. Please find it attached.
> Needed gcc and ld info below.

Thanks.

It seems like things go wrong when lparmap.s is generated with
(DWARF) debug info; could you try building it (manually) with -g0
added on the end of the compile line, and see if head_64.o compiles
okay for you then? If so, I'll prepare a proper patch for it, I
have a similar one (also for lparmap!) in my queue already...

Segher
Re: 2.6.23-rc1-mm2
> Somehow your defconfig is targeting a PPC64 box: CONFIG_PPC64=y
> shouldn't be set if you want to build a kernel for a G3 imac.

allyesconfig/allmodconfig select a 64-bit build always. Maybe it
shouldn't.

Segher
Re: 2.6.23-rc1-mm2
On 2 aug 2007, at 12:14, Mariusz Kozlowski wrote:
> Second issue as reported earlier allmodconfig fails to build on imac
> g3.

Do you really mean g3? If so it's a 32-bit kernel and it shouldn't
be building lparmap.s. It might be a bug nevertheless, there are more
"issues" with the interesting way lparmap.s is built and used.

Segher
Re: 2.6.23-rc1-mm2
> Second issue as reported earlier allmodconfig fails to build on imac
> g3.
>
>   CC      arch/powerpc/kernel/lparmap.s
>   AS      arch/powerpc/kernel/head_64.o
> lparmap.c: Assembler messages:
> lparmap.c:84: Error: file number 1 already allocated
> make[1]: *** [arch/powerpc/kernel/head_64.o] Blad 1
> make: *** [arch/powerpc/kernel] Blad 2

Please send me the full output of:

  gcc --version    (or whatever your gcc is called)
  ld --version
  ld --help        (I know no better way to get the supported binutils
                    targets, and the default target)

and the lparmap.s file. You might want to skip sending it
to the lists, it will be a bit big (and off-topic on most
of those lists, anyway).

Segher
Re: [NFS] 2.6.23-rc1-mm2
On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:
> Hi,
>
> On Monday 06 August 2007 08:24, Johannes Berg wrote:
> > On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > > To avoid a possible confusion: it is still OK if work->func() flushes
> > > its own workqueue, so strictly speaking this trace is false positive,
> > > but it would be very nice if we can get rid of this practice.
> >
> > I just had a thought: we could get rid of this warning by using a
> > read-lock here. That way, flushing from within a work function (which
> > would be seen as read-after-read recursive lock) won't trigger this
> > warning. Patch below. This would, however, also get rid of any warnings
> > for run_workqueue recursion. Which again we may or may not want, the
> > code indicates that it should be allowed up to a depth of three.
> >
> > However, the question whether we should allow flush_workqueue from
> > within a struct work is mainly an API policy issue; it doesn't hurt to
> > flush a workqueue from within a work, but it is probably nearer the
> > intent to use targeted cancel_work_sync() or such. OTOH, one could
> > imagine situations where multiple different work structs are on that
> > workqueue belonging to the same subsystem and then the general
> > flush_scheduled_work() call is the only way to guarantee nothing is
> > scheduled at a given point... I don't feel qualified to make the
> > decision for or against allowing this use of the API at this point.
> >
> > Marc, do you have an easy way to trigger this warning? Could you verify
> > that it goes away with the patch below applied?
>
> just booting into X is enough.
>
> I applied the patch, but now I get:
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc1-mm2 #4
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
> {softirq-on-W} state was registered at:
>   [] __lock_acquire+0x650/0x1030
>   [] lock_acquire+0x61/0x80
>   [] _spin_lock+0x2c/0x40
>   [] _atomic_dec_and_lock+0x17/0x60
>   [] put_rpccred+0x5d/0x100 [sunrpc]
>   [] rpcauth_unbindcred+0x21/0x60 [sunrpc]
>   [] a0 [sunrpc]
>   [] rpc_call_sync+0x30/0x40 [sunrpc]
>   [] rpcb_register+0xdb/0x180 [sunrpc]
>   [] svc_register+0x93/0x160 [sunrpc]
>   [] __svc_create+0x1ee/0x220 [sunrpc]
>   [] svc_create+0x13/0x20 [sunrpc]
>   [] nfs_callback_up+0x82/0x120 [nfs]
>   [] nfs_get_client+0x176/0x390 [nfs]
>   [] nfs4_set_client+0x31/0x190 [nfs]
>   [] nfs4_create_server+0x63/0x3b0 [nfs]
>   [] nfs4_get_sb+0x346/0x5b0 [nfs]
>   [] vfs_kern_mount+0x94/0x110
>   [] do_mount+0x1f2/0x7d0
>   [] sys_mount+0x66/0xa0
>   [] syscall_call+0x7/0xb
>   [] 0x
> irq event stamp: 5277830
> hardirqs last  enabled at (5277830): [] kmem_cache_free+0x8a/0xc0
> hardirqs last disabled at (5277829): [] kmem_cache_free+0x52/0xc0
> softirqs last  enabled at (5277798): [] __do_softirq+0xa3/0xc0
> softirqs last disabled at (5277817): [] do_softirq+0x47/0x50
>
> other info that might help us debug this:
> no locks held by swapper/0.
>
> stack backtrace:
>  [] show_trace_log_lvl+0x1a/0x30
>  [] show_trace+0x12/0x20
>  [] dump_stack+0x15/0x20
>  [] print_usage_bug+0x153/0x160
>  [] mark_lock+0x449/0x620
>  [] __lock_acquire+0x604/0x1030
>  [] lock_acquire+0x61/0x80
>  [] _spin_lock+0x2c/0x40
>  [] _atomic_dec_and_lock+0x17/0x60
>  [] put_rpccred+0x5d/0x100 [sunrpc]
>  [] nfs_free_delegation_callback+0x13/0x20 [nfs]
>  [] __rcu_process_callbacks+0x6a/0x1c0
>  [] rcu_process_callbacks+0x12/0x30
>  [] tasklet_action+0x38/0x80
>  [] __do_softirq+0x55/0xc0
>  [] do_softirq+0x47/0x50
>  [] irq_exit+0x35/0x40
>  [] smp_apic_timer_interrupt+0x43/0x80
>  [] apic_timer_interrupt+0x33/0x38
>  [] cpuidle_idle_call+0x6f/0x90
>  [] cpu_idle+0x43/0x70
>  [] rest_init+0x47/0x50
>  [] start_kernel+0x22a/0x2b0
>  [<>] 0x0
>  =======================

That is a different matter. I assume this patch should suffice to fix
the above problem.

Trond

--- Begin Message ---
Doing so would require us to introduce bh-safe locks into put_rpccred().

Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 fs/nfs/delegation.c |   21 +++++++++++++++------
 1 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 20ac403..c55a761 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -20,10 +20,8 @@
 #include "delegation.h"
 #include "internal.h"
 
-static void nfs_f
Re: [linux-usb-devel] 2.6.23-rc1-mm2 + cpufreq patch + hot-fixes -- [] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
2007/8/6, Alan Stern <[EMAIL PROTECTED]>:
> On Sat, 4 Aug 2007, Miles Lane wrote:
>
> > Initializing USB Mass Storage driver...
> > usb-storage 4-3:1.0: usb_probe_interface
> > usb-storage 4-3:1.0: usb_probe_interface - got id
> > scsi2 : SCSI emulation for USB Mass Storage devices
> > usbcore: registered new interface driver usb-storage
> > usb-storage: device found at 2
> > usb-storage: waiting for device to settle before scanning
> > schedule_timeout: wrong timeout value f8ea51d2
> >  [] show_trace_log_lvl+0x12/0x25
> >  [] show_trace+0xd/0x10
> >  [] dump_stack+0x16/0x18
> >  [] schedule_timeout+0x2c/0x8b
> >  [] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
> >  [] kthread+0x3b/0x63
> >  [] kernel_thread_helper+0x7/0x10
> >  =======================
>
> Does this happen repeatably?
>
> Did you set usb-storage's delay_use parameter to something peculiar?

I also have the same problem. It is caused by
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/freezer-introduce-freezer-firendly-waiting-macros.patch

The patch below may not be a good fix, but it shows what the problem is.

Index: 2.6-mm/include/linux/freezer.h
===================================================================
--- 2.6-mm.orig/include/linux/freezer.h
+++ 2.6-mm/include/linux/freezer.h
@@ -149,13 +149,13 @@ static inline void set_freezable(void)
 #define wait_event_freezable_timeout(wq, condition, timeout)		\
 ({									\
-	long __ret = timeout;						\
+	long ret = timeout;						\
 	do {								\
-		__ret = wait_event_interruptible_timeout(wq,		\
+		ret = wait_event_interruptible_timeout(wq,		\
 				(condition) || freezing(current),	\
-				__ret);					\
+				ret);					\
 	} while (try_to_freeze());					\
-	__ret;								\
+	ret;								\
 })
 #else /* !CONFIG_PM_SLEEP */
 static inline int frozen(struct task_struct *p) { return 0; }
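The rename in the patch above matters because of how nested statement-expression macros resolve names. What follows is a minimal userspace sketch, not kernel code. It assumes the inner macro declares a local with the same name as the outer one, which is how wait_event_interruptible_timeout() is understood to use __ret in kernels of this era. It relies on GNU statement expressions (gcc and clang). The names `inner_wait`, `outer_wait_buggy`, `outer_wait_fixed`, `w_ret` and `o_ret` are hypothetical, and the inner macro is written so that the lost value shows up as a deterministic 0 rather than as an uninitialized read.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for an inner wait macro. It keeps its result in a
 * block-local variable named w_ret and simply hands the timeout back.
 */
#define inner_wait(timeout)			\
({						\
	long w_ret = 0;				\
	w_ret += (timeout);			\
	w_ret;					\
})

/*
 * Buggy outer macro: it also names its local w_ret. The argument `w_ret`
 * passed to inner_wait() is expanded inside the inner block, where it
 * refers to the inner variable (still 0), so the caller's timeout is lost.
 */
#define outer_wait_buggy(timeout)		\
({						\
	long w_ret = (timeout);			\
	w_ret = inner_wait(w_ret);		\
	w_ret;					\
})

/* Fixed outer macro: a distinct local name cannot be captured. */
#define outer_wait_fixed(timeout)		\
({						\
	long o_ret = (timeout);			\
	o_ret = inner_wait(o_ret);		\
	o_ret;					\
})

/* Function wrappers so the macros can be exercised from ordinary code. */
long call_buggy(long timeout) { return outer_wait_buggy(timeout); }
long call_fixed(long timeout) { return outer_wait_fixed(timeout); }
```

In this sketch call_buggy(100) yields 0 while call_fixed(100) yields 100. If the inner declaration were initialized directly from its argument, as in `long __ret = timeout;`, the captured name would make the variable initialize from itself, and the resulting indeterminate value would be consistent with the "wrong timeout value" line in the report.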
Re: [linux-usb-devel] 2.6.23-rc1-mm2 + cpufreq patch + hot-fixes -- [] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
On Sat, 4 Aug 2007, Miles Lane wrote:

> Initializing USB Mass Storage driver...
> usb-storage 4-3:1.0: usb_probe_interface
> usb-storage 4-3:1.0: usb_probe_interface - got id
> scsi2 : SCSI emulation for USB Mass Storage devices
> usbcore: registered new interface driver usb-storage
> usb-storage: device found at 2
> usb-storage: waiting for device to settle before scanning
> schedule_timeout: wrong timeout value f8ea51d2
>  [] show_trace_log_lvl+0x12/0x25
>  [] show_trace+0xd/0x10
>  [] dump_stack+0x16/0x18
>  [] schedule_timeout+0x2c/0x8b
>  [] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
>  [] kthread+0x3b/0x63
>  [] kernel_thread_helper+0x7/0x10
>  =======================

Does this happen repeatably?

Did you set usb-storage's delay_use parameter to something peculiar?

Alan Stern
Re: [NFS] 2.6.23-rc1-mm2
On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:

> I applied the patch, but now I get:
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc1-mm2 #4
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60

Interesting, but doesn't seem related to this at all. As Oleg just
pointed out this basically disabled checking for workqueue stuff so this
should be looked into by somebody familiar with the NFS code.

johannes
Re: [NFS] 2.6.23-rc1-mm2
Hi,

On Monday 06 August 2007 08:24, Johannes Berg wrote:
> On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > To avoid a possible confusion: it is still OK if work->func() flushes
> > its own workqueue, so strictly speaking this trace is false positive,
> > but it would be very nice if we can get rid of this practice.
>
> I just had a thought: we could get rid of this warning by using a
> read-lock here. That way, flushing from within a work function (which
> would be seen as read-after-read recursive lock) won't trigger this
> warning. Patch below. This would, however, also get rid of any warnings
> for run_workqueue recursion. Which again we may or may not want, the
> code indicates that it should be allowed up to a depth of three.
>
> However, the question whether we should allow flush_workqueue from
> within a struct work is mainly an API policy issue; it doesn't hurt to
> flush a workqueue from within a work, but it is probably nearer the
> intent to use targeted cancel_work_sync() or such. OTOH, one could
> imagine situations where multiple different work structs are on that
> workqueue belonging to the same subsystem and then the general
> flush_scheduled_work() call is the only way to guarantee nothing is
> scheduled at a given point... I don't feel qualified to make the
> decision for or against allowing this use of the API at this point.
>
> Marc, do you have an easy way to trigger this warning? Could you verify
> that it goes away with the patch below applied?

just booting into X is enough.

I applied the patch, but now I get:

=================================
[ INFO: inconsistent lock state ]
2.6.23-rc1-mm2 #4
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
{softirq-on-W} state was registered at:
  [] __lock_acquire+0x650/0x1030
  [] lock_acquire+0x61/0x80
  [] _spin_lock+0x2c/0x40
  [] _atomic_dec_and_lock+0x17/0x60
  [] put_rpccred+0x5d/0x100 [sunrpc]
  [] rpcauth_unbindcred+0x21/0x60 [sunrpc]
  [] a0 [sunrpc]
  [] rpc_call_sync+0x30/0x40 [sunrpc]
  [] rpcb_register+0xdb/0x180 [sunrpc]
  [] svc_register+0x93/0x160 [sunrpc]
  [] __svc_create+0x1ee/0x220 [sunrpc]
  [] svc_create+0x13/0x20 [sunrpc]
  [] nfs_callback_up+0x82/0x120 [nfs]
  [] nfs_get_client+0x176/0x390 [nfs]
  [] nfs4_set_client+0x31/0x190 [nfs]
  [] nfs4_create_server+0x63/0x3b0 [nfs]
  [] nfs4_get_sb+0x346/0x5b0 [nfs]
  [] vfs_kern_mount+0x94/0x110
  [] do_mount+0x1f2/0x7d0
  [] sys_mount+0x66/0xa0
  [] syscall_call+0x7/0xb
  [] 0x
irq event stamp: 5277830
hardirqs last  enabled at (5277830): [] kmem_cache_free+0x8a/0xc0
hardirqs last disabled at (5277829): [] kmem_cache_free+0x52/0xc0
softirqs last  enabled at (5277798): [] __do_softirq+0xa3/0xc0
softirqs last disabled at (5277817): [] do_softirq+0x47/0x50

other info that might help us debug this:
no locks held by swapper/0.

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x30
 [] show_trace+0x12/0x20
 [] dump_stack+0x15/0x20
 [] print_usage_bug+0x153/0x160
 [] mark_lock+0x449/0x620
 [] __lock_acquire+0x604/0x1030
 [] lock_acquire+0x61/0x80
 [] _spin_lock+0x2c/0x40
 [] _atomic_dec_and_lock+0x17/0x60
 [] put_rpccred+0x5d/0x100 [sunrpc]
 [] nfs_free_delegation_callback+0x13/0x20 [nfs]
 [] __rcu_process_callbacks+0x6a/0x1c0
 [] rcu_process_callbacks+0x12/0x30
 [] tasklet_action+0x38/0x80
 [] __do_softirq+0x55/0xc0
 [] do_softirq+0x47/0x50
 [] irq_exit+0x35/0x40
 [] smp_apic_timer_interrupt+0x43/0x80
 [] apic_timer_interrupt+0x33/0x38
 [] cpuidle_idle_call+0x6f/0x90
 [] cpu_idle+0x43/0x70
 [] rest_init+0x47/0x50
 [] start_kernel+0x22a/0x2b0
 [<>] 0x0
 =======================

Also, sometimes this kernel hangs because processes accessing NFS remain
in D state.

Marc

> johannes
>
> ---
>  kernel/workqueue.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> --- wireless-dev.orig/kernel/workqueue.c	2007-08-06 08:11:23.297846657 +0200
> +++ wireless-dev/kernel/workqueue.c	2007-08-06 08:19:54.727846657 +0200
> @@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor
>
>  		BUG_ON(get_wq_data(work) != cwq);
>  		work_clear_pending(work);
> -		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  		lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
>  		f(work);
>  		lock_release(&lockdep_map, 1, _THIS_IP_);
> @@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
>  	int cpu;
>
>  	might_sleep();
> -	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +	lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  	lock_release(&wq->lockdep_map,
Re: [NFS] 2.6.23-rc1-mm2
On Mon, 2007-08-06 at 14:53 +0400, Oleg Nesterov wrote:

> But this makes ->lockdep_map meaningless? We always take wq->lockdep_map
> for reading, now we can't detect deadlocks.
>
> 	read_lock(A);
> 	lock(B);
>
> vs
> 	lock(B);
> 	read_lock(A);
>
> is valid, kernel/lockdep.c should not complain.

Ah, hmm. Good point, I guess you can always have multiple read locks.
Then we'd have to make a new parameter or such to get rid of the
recursive locking try message. But if you want to deprecate the API
anyway then this is a good way to find it.

johannes
Re: [NFS] 2.6.23-rc1-mm2
On 08/06, Johannes Berg wrote:
>
> On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
>
> > To avoid a possible confusion: it is still OK if work->func() flushes
> > its own workqueue, so strictly speaking this trace is false positive,
> > but it would be very nice if we can get rid of this practice.
>
> However, the question whether we should allow flush_workqueue from
> within a struct work is mainly an API policy issue; it doesn't hurt to
> flush a workqueue from within a work,

I am not sure, but currently I hope we can forbid this eventually, so
I personally think it is good that your patch complains.

> --- wireless-dev.orig/kernel/workqueue.c	2007-08-06 08:11:23.297846657 +0200
> +++ wireless-dev/kernel/workqueue.c	2007-08-06 08:19:54.727846657 +0200
> @@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor
>
>  		BUG_ON(get_wq_data(work) != cwq);
>  		work_clear_pending(work);
> -		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  		lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
>  		f(work);
>  		lock_release(&lockdep_map, 1, _THIS_IP_);
> @@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
>  	int cpu;
>
>  	might_sleep();
> -	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +	lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
>  	for_each_cpu_mask(cpu, *cpu_map)
>  		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
> @@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str
>  	if (cwq->thread == NULL)
>  		return;
>
> -	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  	lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);
>
>  	flush_cpu_workqueue(cwq);

But this makes ->lockdep_map meaningless? We always take wq->lockdep_map
for reading, now we can't detect deadlocks.

	read_lock(A);
	lock(B);

vs
	lock(B);
	read_lock(A);

is valid, kernel/lockdep.c should not complain. No?

Oleg.
Re: [NFS] 2.6.23-rc1-mm2
On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:

> To avoid a possible confusion: it is still OK if work->func() flushes
> its own workqueue, so strictly speaking this trace is false positive,
> but it would be very nice if we can get rid of this practice.

I just had a thought: we could get rid of this warning by using a
read-lock here. That way, flushing from within a work function (which
would be seen as read-after-read recursive lock) won't trigger this
warning. Patch below. This would, however, also get rid of any warnings
for run_workqueue recursion. Which again we may or may not want, the
code indicates that it should be allowed up to a depth of three.

However, the question whether we should allow flush_workqueue from
within a struct work is mainly an API policy issue; it doesn't hurt to
flush a workqueue from within a work, but it is probably nearer the
intent to use targeted cancel_work_sync() or such. OTOH, one could
imagine situations where multiple different work structs are on that
workqueue belonging to the same subsystem and then the general
flush_scheduled_work() call is the only way to guarantee nothing is
scheduled at a given point... I don't feel qualified to make the
decision for or against allowing this use of the API at this point.

Marc, do you have an easy way to trigger this warning? Could you verify
that it goes away with the patch below applied?

johannes

---
 kernel/workqueue.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- wireless-dev.orig/kernel/workqueue.c	2007-08-06 08:11:23.297846657 +0200
+++ wireless-dev/kernel/workqueue.c	2007-08-06 08:19:54.727846657 +0200
@@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor
 
 		BUG_ON(get_wq_data(work) != cwq);
 		work_clear_pending(work);
-		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 		lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
 		f(work);
 		lock_release(&lockdep_map, 1, _THIS_IP_);
@@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
 	int cpu;
 
 	might_sleep();
-	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
 	for_each_cpu_mask(cpu, *cpu_map)
 		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
@@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str
 	if (cwq->thread == NULL)
 		return;
 
-	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 	lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);
 
 	flush_cpu_workqueue(cwq);
Re: [NFS] 2.6.23-rc1-mm2
On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote: To avoid a possible confusion: it is still OK if work-func() flushes its own workqueue, so strictly speaking this trace is false positive, but it would be very nice if we can get rid of this practice. I just had a thought: we could get rid of this warning by using a read-lock here. That way, flushing from within a work function (which would be seen as read-after-read recursive lock) won't trigger this warning. Patch below. This would, however, also get rid of any warnings for run_workqueue recursion. Which again we may or may not want, the code inidicates that it should be allowed up to a depth of three. However, the question whether we should allow flush_workqueue from within a struct work is mainly an API policy issue; it doesn't hurt to flush a workqueue from within a work, but it is probably nearer the intent to use targeted cancel_work_sync() or such. OTOH, one could imagine situations where multiple different work structs are on that workqueue belonging to the same subsystem and then the general flush_scheduled_work() call is the only way to guarantee nothing is on scheduled at a given point... I don't feel qualified to make the decision for or against allowing this use of the API at this point. Marc, do you have an easy way to trigger this warning? Could you verify that it goes away with the patch below applied? 
johannes --- kernel/workqueue.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- wireless-dev.orig/kernel/workqueue.c2007-08-06 08:11:23.297846657 +0200 +++ wireless-dev/kernel/workqueue.c 2007-08-06 08:19:54.727846657 +0200 @@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor BUG_ON(get_wq_data(work) != cwq); work_clear_pending(work); - lock_acquire(cwq-wq-lockdep_map, 0, 0, 0, 2, _THIS_IP_); + lock_acquire(cwq-wq-lockdep_map, 0, 0, 1, 2, _THIS_IP_); lock_acquire(lockdep_map, 0, 0, 0, 2, _THIS_IP_); f(work); lock_release(lockdep_map, 1, _THIS_IP_); @@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor int cpu; might_sleep(); - lock_acquire(wq-lockdep_map, 0, 0, 0, 2, _THIS_IP_); + lock_acquire(wq-lockdep_map, 0, 0, 1, 2, _THIS_IP_); lock_release(wq-lockdep_map, 1, _THIS_IP_); for_each_cpu_mask(cpu, *cpu_map) flush_cpu_workqueue(per_cpu_ptr(wq-cpu_wq, cpu)); @@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str if (cwq-thread == NULL) return; - lock_acquire(cwq-wq-lockdep_map, 0, 0, 0, 2, _THIS_IP_); + lock_acquire(cwq-wq-lockdep_map, 0, 0, 1, 2, _THIS_IP_); lock_release(cwq-wq-lockdep_map, 1, _THIS_IP_); flush_cpu_workqueue(cwq); signature.asc Description: This is a digitally signed message part
Re: [NFS] 2.6.23-rc1-mm2
On 08/06, Johannes Berg wrote: On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote: To avoid a possible confusion: it is still OK if work-func() flushes its own workqueue, so strictly speaking this trace is false positive, but it would be very nice if we can get rid of this practice. However, the question whether we should allow flush_workqueue from within a struct work is mainly an API policy issue; it doesn't hurt to flush a workqueue from within a work, I am not sure, but currently I hope we can forbid this eventually, so I personally think it is good that your patch complains. --- wireless-dev.orig/kernel/workqueue.c 2007-08-06 08:11:23.297846657 +0200 +++ wireless-dev/kernel/workqueue.c 2007-08-06 08:19:54.727846657 +0200 @@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor BUG_ON(get_wq_data(work) != cwq); work_clear_pending(work); - lock_acquire(cwq-wq-lockdep_map, 0, 0, 0, 2, _THIS_IP_); + lock_acquire(cwq-wq-lockdep_map, 0, 0, 1, 2, _THIS_IP_); lock_acquire(lockdep_map, 0, 0, 0, 2, _THIS_IP_); f(work); lock_release(lockdep_map, 1, _THIS_IP_); @@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor int cpu; might_sleep(); - lock_acquire(wq-lockdep_map, 0, 0, 0, 2, _THIS_IP_); + lock_acquire(wq-lockdep_map, 0, 0, 1, 2, _THIS_IP_); lock_release(wq-lockdep_map, 1, _THIS_IP_); for_each_cpu_mask(cpu, *cpu_map) flush_cpu_workqueue(per_cpu_ptr(wq-cpu_wq, cpu)); @@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str if (cwq-thread == NULL) return; - lock_acquire(cwq-wq-lockdep_map, 0, 0, 0, 2, _THIS_IP_); + lock_acquire(cwq-wq-lockdep_map, 0, 0, 1, 2, _THIS_IP_); lock_release(cwq-wq-lockdep_map, 1, _THIS_IP_); flush_cpu_workqueue(cwq); But this makes -lockdep_map meaningless? We always take wq-lockdep_map for reading, now we can't detect deadlocks. read_lock(A); lock(B); vs lock(B); read_lock(A); is valid, kernel/lockdep.c should not complain. No? Oleg. 
Re: [NFS] 2.6.23-rc1-mm2
On Mon, 2007-08-06 at 14:53 +0400, Oleg Nesterov wrote:

> But this makes ->lockdep_map meaningless? We always take wq->lockdep_map
> for reading, now we can't detect deadlocks.
>
> read_lock(A); lock(B);
> vs
> lock(B); read_lock(A);
>
> is valid, kernel/lockdep.c should not complain.

Ah, hmm. Good point, I guess you can always have multiple read locks. Then we'd have to make a new parameter or such to get rid of the recursive locking try message. But if you want to deprecate the API anyway then this is a good way to find it.

johannes
Re: [NFS] 2.6.23-rc1-mm2
On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:

> I applied the patch, but now I get:
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc1-mm2 #4
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60

Interesting, but doesn't seem related to this at all. As Oleg just pointed out this basically disabled checking for workqueue stuff so this should be looked into by somebody familiar with the NFS code.

johannes
Re: [NFS] 2.6.23-rc1-mm2
Hi,

On Monday 06 August 2007 08:24, Johannes Berg wrote:
> On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > To avoid a possible confusion: it is still OK if work->func() flushes
> > its own workqueue, so strictly speaking this trace is false positive,
> > but it would be very nice if we can get rid of this practice.
>
> I just had a thought: we could get rid of this warning by using a
> read-lock here. That way, flushing from within a work function (which
> would be seen as read-after-read recursive lock) won't trigger this
> warning. Patch below.
>
> This would, however, also get rid of any warnings for run_workqueue
> recursion. Which again we may or may not want, the code indicates that
> it should be allowed up to a depth of three.
>
> However, the question whether we should allow flush_workqueue from
> within a struct work is mainly an API policy issue; it doesn't hurt to
> flush a workqueue from within a work, but it is probably nearer the
> intent to use targeted cancel_work_sync() or such. OTOH, one could
> imagine situations where multiple different work structs are on that
> workqueue belonging to the same subsystem and then the general
> flush_scheduled_work() call is the only way to guarantee nothing is
> scheduled at a given point... I don't feel qualified to make the
> decision for or against allowing this use of the API at this point.
>
> Marc, do you have an easy way to trigger this warning? Could you verify
> that it goes away with the patch below applied?

just booting into X is enough. I applied the patch, but now I get:

=================================
[ INFO: inconsistent lock state ]
2.6.23-rc1-mm2 #4
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes: (rpc_credcache_lock){-+..}, at: [c01dc487] _atomic_dec_and_lock+0x17/0x60 {softirq-on-W} state was registered at: [c013e870] __lock_acquire+0x650/0x1030 [c013f2b1] lock_acquire+0x61/0x80 [c02db9ac] _spin_lock+0x2c/0x40 [c01dc487] _atomic_dec_and_lock+0x17/0x60 [dced55fd] put_rpccred+0x5d/0x100 [sunrpc] [dced56c1] rpcauth_unbindcred+0x21/0x60 [sunrpc] [dced3fd4] a0 [sunrpc] [dcecefe0] rpc_call_sync+0x30/0x40 [sunrpc] [dcedc73b] rpcb_register+0xdb/0x180 [sunrpc] [dced65b3] svc_register+0x93/0x160 [sunrpc] [dced6ebe] __svc_create+0x1ee/0x220 [sunrpc] [dced7053] svc_create+0x13/0x20 [sunrpc] [dcf6d722] nfs_callback_up+0x82/0x120 [nfs] [dcf48f36] nfs_get_client+0x176/0x390 [nfs] [dcf49181] nfs4_set_client+0x31/0x190 [nfs] [dcf49983] nfs4_create_server+0x63/0x3b0 [nfs] [dcf52426] nfs4_get_sb+0x346/0x5b0 [nfs] [c017b444] vfs_kern_mount+0x94/0x110 [c0190a62] do_mount+0x1f2/0x7d0 [c01910a6] sys_mount+0x66/0xa0 [c0104046] syscall_call+0x7/0xb [] 0x irq event stamp: 5277830 hardirqs last enabled at (5277830): [c017530a] kmem_cache_free+0x8a/0xc0 hardirqs last disabled at (5277829): [c01752d2] kmem_cache_free+0x52/0xc0 softirqs last enabled at (5277798): [c0124173] __do_softirq+0xa3/0xc0 softirqs last disabled at (5277817): [c01241d7] do_softirq+0x47/0x50 other info that might help us debug this: no locks held by swapper/0. 
stack backtrace: [c0104fda] show_trace_log_lvl+0x1a/0x30 [c0105c02] show_trace+0x12/0x20 [c0105d15] dump_stack+0x15/0x20 [c013ccc3] print_usage_bug+0x153/0x160 [c013d8b9] mark_lock+0x449/0x620 [c013e824] __lock_acquire+0x604/0x1030 [c013f2b1] lock_acquire+0x61/0x80 [c02db9ac] _spin_lock+0x2c/0x40 [c01dc487] _atomic_dec_and_lock+0x17/0x60 [dced55fd] put_rpccred+0x5d/0x100 [sunrpc] [dcf6bf83] nfs_free_delegation_callback+0x13/0x20 [nfs] [c012f9ea] __rcu_process_callbacks+0x6a/0x1c0 [c012fb52] rcu_process_callbacks+0x12/0x30 [c0124218] tasklet_action+0x38/0x80 [c0124125] __do_softirq+0x55/0xc0 [c01241d7] do_softirq+0x47/0x50 [c0124605] irq_exit+0x35/0x40 [c0112463] smp_apic_timer_interrupt+0x43/0x80 [c0104a77] apic_timer_interrupt+0x33/0x38 [c02690df] cpuidle_idle_call+0x6f/0x90 [c01023c3] cpu_idle+0x43/0x70 [c02d8c27] rest_init+0x47/0x50 [c03bcb6a] start_kernel+0x22a/0x2b0 [] 0x0 === also, sometimes this kernel hangs because of nfs accessing processes remain in D state. Marc johannes --- kernel/workqueue.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- wireless-dev.orig/kernel/workqueue.c 2007-08-06 08:11:23.297846657 +0200 +++ wireless-dev/kernel/workqueue.c 2007-08-06 08:19:54.727846657 +0200 @@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor BUG_ON(get_wq_data(work) != cwq); work_clear_pending(work); - lock_acquire(cwq-wq-lockdep_map, 0, 0, 0, 2, _THIS_IP_); + lock_acquire(cwq-wq-lockdep_map, 0, 0, 1, 2, _THIS_IP_); lock_acquire(lockdep_map, 0, 0, 0, 2, _THIS_IP_); f(work); lock_release(lockdep_map, 1, _THIS_IP_); @@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor int cpu
Re: [linux-usb-devel] 2.6.23-rc1-mm2 + cpufreq patch + hot-fixes -- [f8ea528f] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
On Sat, 4 Aug 2007, Miles Lane wrote:

> Initializing USB Mass Storage driver...
> usb-storage 4-3:1.0: usb_probe_interface
> usb-storage 4-3:1.0: usb_probe_interface - got id
> scsi2 : SCSI emulation for USB Mass Storage devices
> usbcore: registered new interface driver usb-storage
> usb-storage: device found at 2
> usb-storage: waiting for device to settle before scanning
> schedule_timeout: wrong timeout value f8ea51d2
>  [<c01080ab>] show_trace_log_lvl+0x12/0x25
>  [<c0108a9e>] show_trace+0xd/0x10
>  [<c0108bac>] dump_stack+0x16/0x18
>  [<c031e31e>] schedule_timeout+0x2c/0x8b
>  [<f8ea528f>] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
>  [<c0139d64>] kthread+0x3b/0x63
>  [<c0107c63>] kernel_thread_helper+0x7/0x10
>  =======================

Does this happen repeatably? Did you set usb-storage's delay_use parameter to something peculiar?

Alan Stern
Re: [linux-usb-devel] 2.6.23-rc1-mm2 + cpufreq patch + hot-fixes -- [f8ea528f] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
2007/8/6, Alan Stern [EMAIL PROTECTED]:
> On Sat, 4 Aug 2007, Miles Lane wrote:
>
> > Initializing USB Mass Storage driver...
> > usb-storage 4-3:1.0: usb_probe_interface
> > usb-storage 4-3:1.0: usb_probe_interface - got id
> > scsi2 : SCSI emulation for USB Mass Storage devices
> > usbcore: registered new interface driver usb-storage
> > usb-storage: device found at 2
> > usb-storage: waiting for device to settle before scanning
> > schedule_timeout: wrong timeout value f8ea51d2
> >  [<c01080ab>] show_trace_log_lvl+0x12/0x25
> >  [<c0108a9e>] show_trace+0xd/0x10
> >  [<c0108bac>] dump_stack+0x16/0x18
> >  [<c031e31e>] schedule_timeout+0x2c/0x8b
> >  [<f8ea528f>] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
> >  [<c0139d64>] kthread+0x3b/0x63
> >  [<c0107c63>] kernel_thread_helper+0x7/0x10
> >  =======================
>
> Does this happen repeatably? Did you set usb-storage's delay_use
> parameter to something peculiar?

I also have the same problem. It is caused by

http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/freezer-introduce-freezer-firendly-waiting-macros.patch

The patch below may not be a good fix, but it shows what the problem is.

Index: 2.6-mm/include/linux/freezer.h
===================================================================
--- 2.6-mm.orig/include/linux/freezer.h
+++ 2.6-mm/include/linux/freezer.h
@@ -149,13 +149,13 @@ static inline void set_freezable(void)
 #define wait_event_freezable_timeout(wq, condition, timeout)		\
 ({									\
-	long __ret = timeout;						\
+	long ret = timeout;						\
 	do {								\
-		__ret = wait_event_interruptible_timeout(wq,		\
+		ret = wait_event_interruptible_timeout(wq,		\
 				(condition) || freezing(current),	\
-				__ret);					\
+				ret);					\
 	} while (try_to_freeze());					\
-	__ret;								\
+	ret;								\
 })
 #else /* !CONFIG_PM_SLEEP */
 static inline int frozen(struct task_struct *p) { return 0; }
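For readers wondering why renaming the variable helps: as far as one can tell from the patch, the outer macro and wait_event_interruptible_timeout() both keep their state in a local called __ret, so passing the outer __ret as the timeout argument makes the inner declaration read from itself, which would explain a garbage timeout such as f8ea51d2. Below is a minimal userspace sketch of that name clash, assuming GCC/Clang statement expressions. `inner_wait`, `outer_wait_fixed` and `demo_fixed` are hypothetical stand-ins invented for this illustration, not the kernel macros.

```c
/*
 * Hypothetical stand-in for wait_event_interruptible_timeout(): it keeps
 * its working value in a local named __ret and "uses up" one tick.
 */
#define inner_wait(timeout)			\
({						\
	long __ret = (timeout);			\
	if (__ret > 0)				\
		__ret -= 1;			\
	__ret;					\
})

/*
 * Broken shape, shown only as a comment because reading the
 * self-initialized variable is undefined behaviour:
 *
 *	long __ret = timeout;
 *	__ret = inner_wait(__ret);
 *
 * After expansion the inner declaration becomes "long __ret = (__ret);",
 * so the inner __ret is initialized from itself, not from the outer one.
 */

/* Fixed shape: the outer local uses a name the inner macro does not. */
#define outer_wait_fixed(timeout)		\
({						\
	long outer_ret = (timeout);		\
	outer_ret = inner_wait(outer_ret);	\
	outer_ret;				\
})

/* Thin function wrapper so the macro can be called like a function. */
long demo_fixed(long timeout)
{
	return outer_wait_fixed(timeout);
}
```

With distinct names the timeout passes through intact, which is the same idea as the `ret` rename in the patch above. Whether a rename or a restructuring of the macro is the better long-term fix is left open in the thread.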
Re: [NFS] 2.6.23-rc1-mm2
On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:
> Hi,
>
> On Monday 06 August 2007 08:24, Johannes Berg wrote:
> > On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > > To avoid a possible confusion: it is still OK if work->func() flushes
> > > its own workqueue, so strictly speaking this trace is false positive,
> > > but it would be very nice if we can get rid of this practice.
> >
> > I just had a thought: we could get rid of this warning by using a
> > read-lock here. That way, flushing from within a work function (which
> > would be seen as read-after-read recursive lock) won't trigger this
> > warning. Patch below.
> >
> > This would, however, also get rid of any warnings for run_workqueue
> > recursion. Which again we may or may not want, the code indicates that
> > it should be allowed up to a depth of three.
> >
> > However, the question whether we should allow flush_workqueue from
> > within a struct work is mainly an API policy issue; it doesn't hurt to
> > flush a workqueue from within a work, but it is probably nearer the
> > intent to use targeted cancel_work_sync() or such. OTOH, one could
> > imagine situations where multiple different work structs are on that
> > workqueue belonging to the same subsystem and then the general
> > flush_scheduled_work() call is the only way to guarantee nothing is
> > scheduled at a given point... I don't feel qualified to make the
> > decision for or against allowing this use of the API at this point.
> >
> > Marc, do you have an easy way to trigger this warning? Could you verify
> > that it goes away with the patch below applied?
>
> just booting into X is enough. I applied the patch, but now I get:
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc1-mm2 #4
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes: (rpc_credcache_lock){-+..}, at: [c01dc487] _atomic_dec_and_lock+0x17/0x60 {softirq-on-W} state was registered at: [c013e870] __lock_acquire+0x650/0x1030 [c013f2b1] lock_acquire+0x61/0x80 [c02db9ac] _spin_lock+0x2c/0x40 [c01dc487] _atomic_dec_and_lock+0x17/0x60 [dced55fd] put_rpccred+0x5d/0x100 [sunrpc] [dced56c1] rpcauth_unbindcred+0x21/0x60 [sunrpc] [dced3fd4] a0 [sunrpc] [dcecefe0] rpc_call_sync+0x30/0x40 [sunrpc] [dcedc73b] rpcb_register+0xdb/0x180 [sunrpc] [dced65b3] svc_register+0x93/0x160 [sunrpc] [dced6ebe] __svc_create+0x1ee/0x220 [sunrpc] [dced7053] svc_create+0x13/0x20 [sunrpc] [dcf6d722] nfs_callback_up+0x82/0x120 [nfs] [dcf48f36] nfs_get_client+0x176/0x390 [nfs] [dcf49181] nfs4_set_client+0x31/0x190 [nfs] [dcf49983] nfs4_create_server+0x63/0x3b0 [nfs] [dcf52426] nfs4_get_sb+0x346/0x5b0 [nfs] [c017b444] vfs_kern_mount+0x94/0x110 [c0190a62] do_mount+0x1f2/0x7d0 [c01910a6] sys_mount+0x66/0xa0 [c0104046] syscall_call+0x7/0xb [] 0x irq event stamp: 5277830 hardirqs last enabled at (5277830): [c017530a] kmem_cache_free+0x8a/0xc0 hardirqs last disabled at (5277829): [c01752d2] kmem_cache_free+0x52/0xc0 softirqs last enabled at (5277798): [c0124173] __do_softirq+0xa3/0xc0 softirqs last disabled at (5277817): [c01241d7] do_softirq+0x47/0x50 other info that might help us debug this: no locks held by swapper/0. 
stack backtrace: [c0104fda] show_trace_log_lvl+0x1a/0x30 [c0105c02] show_trace+0x12/0x20 [c0105d15] dump_stack+0x15/0x20 [c013ccc3] print_usage_bug+0x153/0x160 [c013d8b9] mark_lock+0x449/0x620 [c013e824] __lock_acquire+0x604/0x1030 [c013f2b1] lock_acquire+0x61/0x80 [c02db9ac] _spin_lock+0x2c/0x40 [c01dc487] _atomic_dec_and_lock+0x17/0x60 [dced55fd] put_rpccred+0x5d/0x100 [sunrpc] [dcf6bf83] nfs_free_delegation_callback+0x13/0x20 [nfs] [c012f9ea] __rcu_process_callbacks+0x6a/0x1c0 [c012fb52] rcu_process_callbacks+0x12/0x30 [c0124218] tasklet_action+0x38/0x80 [c0124125] __do_softirq+0x55/0xc0 [c01241d7] do_softirq+0x47/0x50 [c0124605] irq_exit+0x35/0x40 [c0112463] smp_apic_timer_interrupt+0x43/0x80 [c0104a77] apic_timer_interrupt+0x33/0x38 [c02690df] cpuidle_idle_call+0x6f/0x90 [c01023c3] cpu_idle+0x43/0x70 [c02d8c27] rest_init+0x47/0x50 [c03bcb6a] start_kernel+0x22a/0x2b0 [] 0x0 === That is a different matter. I assume this patch should suffice to fix the above problem. Trond ---BeginMessage--- Doing so would require us to introduce bh-safe locks into put_rpccred(). Signed-off-by: Trond Myklebust [EMAIL PROTECTED] --- fs/nfs/delegation.c | 21 +++-- 1 files changed, 15 insertions(+), 6 deletions(-) diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 20ac403..c55a761 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -20,10 +20,8 @@ #include delegation.h #include internal.h -static void nfs_free_delegation(struct nfs_delegation *delegation) +static void nfs_do_free_delegation(struct nfs_delegation *delegation) { - if (delegation-cred
Re: 2.6.23-rc1-mm2
> Second issue as reported earlier: allmodconfig fails to build on imac g3.
>
>   CC      arch/powerpc/kernel/lparmap.s
>   AS      arch/powerpc/kernel/head_64.o
> lparmap.c: Assembler messages:
> lparmap.c:84: Error: file number 1 already allocated
> make[1]: *** [arch/powerpc/kernel/head_64.o] Blad 1
> make: *** [arch/powerpc/kernel] Blad 2

Please send me the full output of:

  gcc --version   (or whatever your gcc is called)
  ld --version
  ld --help       (I know no better way to get the supported binutils targets, and the default target)

and the lparmap.s file. You might want to skip sending it to the lists, it will be a bit big (and off-topic on most of those lists, anyway).

Segher
Re: 2.6.23-rc1-mm2
On 2 aug 2007, at 12:14, Mariusz Kozlowski wrote:

> Second issue as reported earlier: allmodconfig fails to build on imac g3.

Do you really mean g3? If so it's a 32-bit kernel and it shouldn't be building lparmap.s.

It might be a bug nevertheless, there are more issues with the interesting way lparmap.s is built and used.

Segher
Re: 2.6.23-rc1-mm2
> Somehow your defconfig is targeting a PPC64 box: CONFIG_PPC64=y
> shouldn't be set if you want to build a kernel for a G3 imac.

allyesconfig/allmodconfig select a 64-bit build always. Maybe it shouldn't.

Segher
Re: 2.6.23-rc1-mm2
> > > Second issue as reported earlier: allmodconfig fails to build on imac g3.
> > >
> > >   CC      arch/powerpc/kernel/lparmap.s
> > >   AS      arch/powerpc/kernel/head_64.o
> > > lparmap.c: Assembler messages:
> > > lparmap.c:84: Error: file number 1 already allocated
> > > make[1]: *** [arch/powerpc/kernel/head_64.o] Blad 1
> > > make: *** [arch/powerpc/kernel] Blad 2
> >
> > Please send me the full output of:
> >
> >   gcc --version   (or whatever your gcc is called)
> >   ld --version
> >   ld --help       (I know no better way to get the supported binutils
> >                    targets, and the default target)
> >
> > and the lparmap.s file. You might want to skip sending it to the
> > lists, it will be a bit big (and off-topic on most of those lists,
> > anyway).
>
> Well ... it's 66kB. Not that bad. Please find it attached. Needed gcc
> and ld info below.

Thanks. It seems like things go wrong when lparmap.s is generated with (DWARF) debug info; could you try building it (manually) with -g0 added on the end of the compile line, and see if head_64.o compiles okay for you then? If so, I'll prepare a proper patch for it, I have a similar one (also for lparmap!) in my queue already...

Segher
Re: 2.6.23-rc1-mm2
> > > > Second issue as reported earlier: allmodconfig fails to build on imac g3.
> > > >
> > > >   CC      arch/powerpc/kernel/lparmap.s
> > > >   AS      arch/powerpc/kernel/head_64.o
> > > > lparmap.c: Assembler messages:
> > > > lparmap.c:84: Error: file number 1 already allocated
> > > > make[1]: *** [arch/powerpc/kernel/head_64.o] Blad 1
> > > > make: *** [arch/powerpc/kernel] Blad 2
> > >
> > > Please send me the full output of:
> > >
> > >   gcc --version   (or whatever your gcc is called)
> > >   ld --version
> > >   ld --help       (I know no better way to get the supported binutils
> > >                    targets, and the default target)
> > >
> > > and the lparmap.s file. You might want to skip sending it to the
> > > lists, it will be a bit big (and off-topic on most of those lists,
> > > anyway).
> >
> > Well ... it's 66kB. Not that bad. Please find it attached. Needed gcc
> > and ld info below.
>
> Thanks. It seems like things go wrong when lparmap.s is generated with
> (DWARF) debug info; could you try building it (manually) with -g0
> added on the end of the compile line, and see if head_64.o compiles
> okay for you then? If so, I'll prepare a proper patch for it, I have
> a similar one (also for lparmap!) in my queue already...

Ok it worked. I had to add -g0 to Makefile under arch/powerpc/kernel because -g0 was added before -g and didn't have any effect when adding to Makefile in top dir. But yes - it compiles now.

Thanks,

Mariusz
Re: 2.6.23-rc1-mm2
> > It seems like things go wrong when lparmap.s is generated with
> > (DWARF) debug info; could you try building it (manually) with -g0
> > added on the end of the compile line, and see if head_64.o compiles
> > okay for you then? If so, I'll prepare a proper patch for it, I have
> > a similar one (also for lparmap!) in my queue already...
>
> Ok it worked. I had to add -g0 to Makefile under arch/powerpc/kernel
> because -g0 was added before -g and didn't have any effect when adding
> to Makefile in top dir.

Yeah, that's why I said build lparmap.s manually :-)

> But yes - it compiles now.

Great, I'll combine it with my other lparmap build patch then. Thanks for the report and testing!

Segher
2.6.23-rc1-mm2 + cpufreq patch + hot-fixes -- [f8ea528f] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
usb usb4: usb resume
ehci_hcd :00:1d.7: resume root hub
hub 4-0:1.0: hub_resume
hub 4-0:1.0: state 7 ports 6 chg  evt 
ehci_hcd :00:1d.7: GetStatus port 3 status 001803 POWER sig=j CSC CONNECT
hub 4-0:1.0: port 3, status 0501, change 0001, 480 Mb/s
hub 4-0:1.0: debounce: port 3: total 100ms stable 100ms status 0x501
ehci_hcd :00:1d.7: port 3 high speed
ehci_hcd :00:1d.7: GetStatus port 3 status 001005 POWER sig=se0 PE CONNECT
usb 4-3: new high speed USB device using ehci_hcd and address 2
ehci_hcd :00:1d.7: port 3 high speed
ehci_hcd :00:1d.7: GetStatus port 3 status 001005 POWER sig=se0 PE CONNECT
usb 4-3: default language 0x0409
usb 4-3: new device found, idVendor=0781, idProduct=5150
usb 4-3: new device strings: Mfr=1, Product=2, SerialNumber=3
usb 4-3: Product: Cruzer Mini
usb 4-3: Manufacturer: SanDisk Corporation
usb 4-3: SerialNumber: SNDK5C31950CBB106202
usb 4-3: uevent
usb 4-3: usb_probe_device
usb 4-3: configuration #1 chosen from 1 choice
usb 4-3: adding 4-3:1.0 (config #1, interface 0)
usb 4-3:1.0: uevent
usb 4-3:1.0: uevent
drivers/usb/core/inode.c: creating file '002'
Initializing USB Mass Storage driver...
usb-storage 4-3:1.0: usb_probe_interface
usb-storage 4-3:1.0: usb_probe_interface - got id
scsi2 : SCSI emulation for USB Mass Storage devices
usbcore: registered new interface driver usb-storage
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
schedule_timeout: wrong timeout value f8ea51d2
 [<c01080ab>] show_trace_log_lvl+0x12/0x25
 [<c0108a9e>] show_trace+0xd/0x10
 [<c0108bac>] dump_stack+0x16/0x18
 [<c031e31e>] schedule_timeout+0x2c/0x8b
 [<f8ea528f>] usb_stor_scan_thread+0xbd/0x15a [usb_storage]
 [<c0139d64>] kthread+0x3b/0x63
 [<c0107c63>] kernel_thread_helper+0x7/0x10
 =======================
USB Mass Storage support registered.
scsi 2:0:0:0: Direct-Access SanDisk Cruzer Mini 0.4 PQ: 0 ANSI: 2
sd 2:0:0:0: [sdb] 2001888 512-byte hardware sectors (1025 MB)
sd 2:0:0:0: [sdb] Write Protect is off
sd 2:0:0:0: [sdb] Mode Sense: 03 00 00 00
sd 2:0:0:0: [sdb] Assuming drive cache: write through
sd 2:0:0:0: [sdb] 2001888 512-byte hardware sectors (1025 MB)
sd 2:0:0:0: [sdb] Write Protect is off
sd 2:0:0:0: [sdb] Mode Sense: 03 00 00 00
sd 2:0:0:0: [sdb] Assuming drive cache: write through
 sdb: sdb1
sd 2:0:0:0: [sdb] Attached SCSI removable disk
sd 2:0:0:0: Attached scsi generic sg2 type 0
usb-storage: device scan complete
Re: [NFS] 2.6.23-rc1-mm2
On 08/03, Trond Myklebust wrote:
>
> On Fri, 2007-08-03 at 09:38 -0700, Andrew Morton wrote:
> > > stack backtrace:
> > >  [] show_trace_log_lvl+0x1a/0x30
> > >  [] show_trace+0x12/0x20
> > >  [] dump_stack+0x15/0x20
> > >  [] __lock_acquire+0xc22/0x1030
> > >  [] lock_acquire+0x61/0x80
> > >  [] flush_workqueue+0x49/0x70
> > >  [] flush_scheduled_work+0xd/0x10
> > >  [] nfs_release_automount_timer+0x2c/0x30 [nfs]
> > >  [] nfs_free_server+0x9e/0xd0 [nfs]
> > >  [] nfs_kill_super+0x16/0x20 [nfs]
> > >  [] deactivate_super+0x7d/0xa0
> > >  [] mntput_no_expire+0x4b/0x80
> > >  [] expire_mount_list+0xe4/0x140
> > >  [] mark_mounts_for_expiry+0x99/0xb0
> > >  [] nfs_expire_automounts+0xd/0x40 [nfs]
> > >  [] run_workqueue+0x12b/0x1e0
> > >  [] worker_thread+0x9b/0x100
> > >  [] kthread+0x42/0x70
> > >  [] kernel_thread_helper+0x7/0x18
> > >  =======================
> > >
> >
> > There is new debugging stuff in -mm: deadlockable usage of workqueue
> > primitives will now trigger lockdep warnings. See
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-flushing-deadlocks-with-lockdep.patch
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-work-related-deadlocks-with-lockdep.patch
> >
> > I am suspecting that running flush_scheduled_work() from within
> > run_workqueue() isn't good.
>
> I'll have a look at this. I suspect that most if not all of our calls to
> run_workqueue()/flush_scheduled_work() can now be replaced by more
> targeted calls to cancel_work_sync() and cancel_delayed_work_sync().

Yes, please, if possible.

To avoid a possible confusion: it is still OK if work->func() flushes its own workqueue, so strictly speaking this trace is false positive, but it would be very nice if we can get rid of this practice.

Oleg.
Re: [NFS] 2.6.23-rc1-mm2
On Fri, 2007-08-03 at 09:38 -0700, Andrew Morton wrote:
> On Fri, 3 Aug 2007 13:00:46 +0200 Marc Dietrich <[EMAIL PROTECTED]> wrote:
>
> >
> > Hi,
> >
> > On Wednesday 01 August 2007 08:09, Andrew Morton wrote:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/
> > >
> > >
> > > - the git-block tree remains dropped due to disagreement with the Vaio
> > >
> > > - git-e1000new was withdrawn by the authors
> > >
> > > - git-wireless is back. It is still a >3MB diff, and appears to compile.
> > >
> > > - Is anyone testing the kgdb code in here?
> >
> > I still get some nfs related locking bug.
> >
> > I applied
> >
> > linux-2.6.23-001-fix_rpciod_down_race.dif
> > linux-2.6.23-003-fix_locking_regression.dif
> > linux-2.6.23-004-fix_stateid_regression.dif
> >
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 2.6.23-rc1-mm2 #3
> > ---------------------------------------------
> > events/0/5 is trying to acquire lock:
> >  (events){--..}, at: [] flush_workqueue+0x0/0x70
> >
> > but task is already holding lock:
> >  (events){--..}, at: [] run_workqueue+0xd4/0x1e0
> >
> > other info that might help us debug this:
> > 2 locks held by events/0/5:
> >  #0:  (events){--..}, at: [] run_workqueue+0xd4/0x1e0
> >  #1:  ((nfs_automount_task).work){--..}, at: [] run_workqueue+0xd4/0x1e0
> >
> > stack backtrace:
> >  [] show_trace_log_lvl+0x1a/0x30
> >  [] show_trace+0x12/0x20
> >  [] dump_stack+0x15/0x20
> >  [] __lock_acquire+0xc22/0x1030
> >  [] lock_acquire+0x61/0x80
> >  [] flush_workqueue+0x49/0x70
> >  [] flush_scheduled_work+0xd/0x10
> >  [] nfs_release_automount_timer+0x2c/0x30 [nfs]
> >  [] nfs_free_server+0x9e/0xd0 [nfs]
> >  [] nfs_kill_super+0x16/0x20 [nfs]
> >  [] deactivate_super+0x7d/0xa0
> >  [] mntput_no_expire+0x4b/0x80
> >  [] expire_mount_list+0xe4/0x140
> >  [] mark_mounts_for_expiry+0x99/0xb0
> >  [] nfs_expire_automounts+0xd/0x40 [nfs]
> >  [] run_workqueue+0x12b/0x1e0
> >  [] worker_thread+0x9b/0x100
> >  [] kthread+0x42/0x70
> >  [] kernel_thread_helper+0x7/0x18
> >  =======================
> >
>
> There is new debugging stuff in -mm: deadlockable usage of workqueue
> primitives will now trigger lockdep warnings. See
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-flushing-deadlocks-with-lockdep.patch
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-work-related-deadlocks-with-lockdep.patch
>
> I am suspecting that running flush_scheduled_work() from within
> run_workqueue() isn't good.

I'll have a look at this. I suspect that most if not all of our calls to run_workqueue()/flush_scheduled_work() can now be replaced by more targeted calls to cancel_work_sync() and cancel_delayed_work_sync().

Trond
Re: 2.6.23-rc1-mm2
On Fri, 3 Aug 2007 13:00:46 +0200 Marc Dietrich <[EMAIL PROTECTED]> wrote:

> Hi,
>
> On Wednesday 01 August 2007 08:09, Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/
> >
> > - the git-block tree remains dropped due to disagreement with the Vaio
> >
> > - git-e1000new was withdrawn by the authors
> >
> > - git-wireless is back. It is still a >3MB diff, and appears to compile.
> >
> > - Is anyone testing the kgdb code in here?
>
> I still get some nfs related locking bug.
>
> I applied
>
> linux-2.6.23-001-fix_rpciod_down_race.dif
> linux-2.6.23-003-fix_locking_regression.dif
> linux-2.6.23-004-fix_stateid_regression.dif
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.23-rc1-mm2 #3
> ---------------------------------------------
> events/0/5 is trying to acquire lock:
>  (events){--..}, at: [<c012ed90>] flush_workqueue+0x0/0x70
>
> but task is already holding lock:
>  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
>
> other info that might help us debug this:
> 2 locks held by events/0/5:
>  #0:  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
>  #1:  ((nfs_automount_task).work){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
>
> stack backtrace:
>  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
>  [<c0105c02>] show_trace+0x12/0x20
>  [<c0105d15>] dump_stack+0x15/0x20
>  [<c013ee42>] __lock_acquire+0xc22/0x1030
>  [<c013f2b1>] lock_acquire+0x61/0x80
>  [<c012edd9>] flush_workqueue+0x49/0x70
>  [<c012ee0d>] flush_scheduled_work+0xd/0x10
>  [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
>  [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
>  [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
>  [<c017b38d>] deactivate_super+0x7d/0xa0
>  [<c018f94b>] mntput_no_expire+0x4b/0x80
>  [<c018fd94>] expire_mount_list+0xe4/0x140
>  [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
>  [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
>  [<c012e61b>] run_workqueue+0x12b/0x1e0
>  [<c012f05b>] worker_thread+0x9b/0x100
>  [<c0131c72>] kthread+0x42/0x70
>  [<c0104c0f>] kernel_thread_helper+0x7/0x18
> ===========

There is new debugging stuff in -mm: deadlockable usage of workqueue
primitives will now trigger lockdep warnings. See

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-flushing-deadlocks-with-lockdep.patch
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-work-related-deadlocks-with-lockdep.patch

I am suspecting that running flush_scheduled_work() from within
run_workqueue() isn't good.
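The shape of the warning can be illustrated with a small userspace model. This is a sketch under loud assumptions: the "running" flag merely stands in for the (events) lock that the trace shows being held in run_workqueue() and requested again in flush_workqueue(); struct workqueue, run_one() and flush_queue() are hypothetical names, not kernel code, and no real locking or lockdep machinery is involved.

```c
#include <stdbool.h>
#include <stddef.h>

struct workqueue;
typedef void (*work_fn)(struct workqueue *wq);

/* Hypothetical stand-in for a workqueue with a lockdep-style annotation. */
struct workqueue {
	const char *name;
	bool running;		/* models "the (events) lock is held" */
	bool recursion_seen;	/* set when a flush happens while running */
};

/* Model of a flush: it only checks whether we are already inside the queue. */
static void flush_queue(struct workqueue *wq)
{
	if (wq->running)
		wq->recursion_seen = true;	/* lockdep would report here */
}

/* Model of running one work item with the queue marked as held. */
static void run_one(struct workqueue *wq, work_fn fn)
{
	wq->running = true;
	fn(wq);
	wq->running = false;
}

/* A work function that ends up flushing the queue it is running on. */
static void self_flushing_work(struct workqueue *wq)
{
	flush_queue(wq);
}

/* A work function that leaves the queue alone. */
static void plain_work(struct workqueue *wq)
{
	(void)wq;
}
```

In the trace above the same pattern appears as nfs_expire_automounts() running from run_workqueue() and eventually reaching flush_scheduled_work() through nfs_release_automount_timer().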
2.6.23-rc1-mm2 + cpufreq patch -- another "inconsistent {in-hardirq-W} -> {hardirq-on-W} usage."
When I ran "modprobe -r ipw2200" I got:

=================================
[ INFO: inconsistent lock state ]
2.6.23-rc1-mm2 #21
---------------------------------
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
modprobe/6888 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&priv->irq_lock){++..}, at: [<f8ce9b1c>] ipw_isr+0x1c/0xa9 [ipw2200]
{in-hardirq-W} state was registered at:
  [<c0144837>] __lock_acquire+0x430/0xbca
  [<c0145047>] lock_acquire+0x76/0x9d
  [<c031ffe0>] _spin_lock+0x23/0x32
  [<f8ce9b1c>] ipw_isr+0x1c/0xa9 [ipw2200]
  [<c0154b44>] handle_IRQ_event+0x1a/0x4f
  [<c0155c6c>] handle_fasteoi_irq+0x7d/0xb6
  [<c0109a5a>] do_IRQ+0xaf/0xd9
  [] 0x
irq event stamp: 136973
hardirqs last  enabled at (136973): [<c0173232>] kfree+0xc7/0xdb
hardirqs last disabled at (136972): [<c01731d2>] kfree+0x67/0xdb
softirqs last  enabled at (136286): [<c012d8a1>] __do_softirq+0xf5/0xfb
softirqs last disabled at (136277): [<c0109932>] do_softirq+0x74/0xed

other info that might help us debug this:
no locks held by modprobe/6888.

stack backtrace:
 [<c01080ab>] show_trace_log_lvl+0x12/0x25
 [<c0108a9e>] show_trace+0xd/0x10
 [<c0108bac>] dump_stack+0x16/0x18
 [<c0143374>] print_usage_bug+0x107/0x114
 [<c0143bdd>] mark_lock+0x1e9/0x400
 [<c01448ab>] __lock_acquire+0x4a4/0xbca
 [<c0145047>] lock_acquire+0x76/0x9d
 [<c031ffe0>] _spin_lock+0x23/0x32
 [<f8ce9b1c>] ipw_isr+0x1c/0xa9 [ipw2200]
 [<c015506f>] free_irq+0xc9/0xf2
 [<f8cea09c>] ipw_pci_remove+0x189/0x1c9 [ipw2200]
 [<c0206dec>] pci_device_remove+0x19/0x39
 [<c026a23a>] __device_release_driver+0x74/0x90
 [<c026a6b6>] driver_detach+0xa2/0xe0
 [<c0269dc5>] bus_remove_driver+0x5d/0x79
 [<c026a71b>] driver_unregister+0x8/0xa
 [<c0206f56>] pci_unregister_driver+0x13/0x55
 [<f8cead75>] ipw_exit+0x1c/0x1e [ipw2200]
 [<c014c85f>] sys_delete_module+0x1c6/0x237
 [<c0106f02>] sysenter_past_esp+0x6b/0xb5
===
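The rule behind this report is that a lock which has ever been taken in hardirq context must never be taken with hardirqs enabled. Here the trace shows ipw_isr() being reached from free_irq() with HC0 and HE1, i.e. outside hardirq context and with hardirqs on. A rough userspace sketch of that bookkeeping follows; the names (struct lock_class, mark_lock_usage(), the two usage bits) are made up for illustration and only approximate what lockdep tracks per lock class.

```c
#include <stdbool.h>

/* Illustrative usage bits recorded per lock class (names are hypothetical). */
enum lock_usage {
	USED_IN_HARDIRQ      = 1u << 0,	/* taken while servicing a hard IRQ */
	USED_HARDIRQ_ENABLED = 1u << 1,	/* taken with hard IRQs enabled */
};

struct lock_class {
	const char *name;
	unsigned int usage;
};

/*
 * Record one acquisition of the lock. Returns false once the class has
 * been seen both in hardirq context and with hardirqs enabled, which is
 * the combination reported as
 * "inconsistent {in-hardirq-W} -> {hardirq-on-W} usage".
 */
static bool mark_lock_usage(struct lock_class *lc, bool in_hardirq,
			    bool hardirqs_enabled)
{
	const unsigned int both = USED_IN_HARDIRQ | USED_HARDIRQ_ENABLED;

	if (in_hardirq)
		lc->usage |= USED_IN_HARDIRQ;
	else if (hardirqs_enabled)
		lc->usage |= USED_HARDIRQ_ENABLED;

	return (lc->usage & both) != both;
}
```

The first call models the interrupt handler taking the lock from do_IRQ(); the second models the same handler being invoked from process context during free_irq().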
Re: 2.6.23-rc1-mm2
Hi,

On Wednesday 01 August 2007 08:09, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/
>
> - the git-block tree remains dropped due to disagreement with the Vaio
>
> - git-e1000new was withdrawn by the authors
>
> - git-wireless is back. It is still a >3MB diff, and appears to compile.
>
> - Is anyone testing the kgdb code in here?

I still get an NFS-related locking bug.

I applied

linux-2.6.23-001-fix_rpciod_down_race.dif
linux-2.6.23-003-fix_locking_regression.dif
linux-2.6.23-004-fix_stateid_regression.dif

=============================================
[ INFO: possible recursive locking detected ]
2.6.23-rc1-mm2 #3
---------------------------------------------
events/0/5 is trying to acquire lock:
 (events){--..}, at: [<c012ed90>] flush_workqueue+0x0/0x70

but task is already holding lock:
 (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0

other info that might help us debug this:
2 locks held by events/0/5:
 #0:  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
 #1:  ((nfs_automount_task).work){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ee42>] __lock_acquire+0xc22/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c012edd9>] flush_workqueue+0x49/0x70
 [<c012ee0d>] flush_scheduled_work+0xd/0x10
 [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
 [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
 [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
 [<c017b38d>] deactivate_super+0x7d/0xa0
 [<c018f94b>] mntput_no_expire+0x4b/0x80
 [<c018fd94>] expire_mount_list+0xe4/0x140
 [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
 [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
 [<c012e61b>] run_workqueue+0x12b/0x1e0
 [<c012f05b>] worker_thread+0x9b/0x100
 [<c0131c72>] kthread+0x42/0x70
 [<c0104c0f>] kernel_thread_helper+0x7/0x18
===

thanks

Marc

--
"The enemy uses unauthorized weapons."
Lord Arthur Ponsonby, "Falsehood in Wartime: Propaganda Lies of the First World War", 1928
Re: 2.6.23-rc1-mm2
On Aug 2, 2007, at 5:14 AM, Mariusz Kozlowski wrote:

Second issue as reported earlier: allmodconfig fails to build on imac g3.

Do you really mean g3? If so it's a 32-bit kernel and it shouldn't be building lparmap.s. Or do you mean G5?

Yes it is iMac G3. More or less something like this:
http://upload.wikimedia.org/wikipedia/commons/c/c0/IMac_Bondi_Blue.jpg

processor       : 0
cpu             : 740/750
temperature     : 47-49 C (uncalibrated)
clock           : 400MHz
revision        : 2.2 (pvr 0008 0202)
bogomips        : 796.67
machine         : PowerMac2,1
motherboard     : PowerMac2,1 MacRISC2 MacRISC Power Macintosh
detected as     : 66 (iMac FireWire)
pmac flags      : 0005
L2 cache        : 512K unified
memory          : 256MB
pmac-generation : NewWorld

  CC      arch/powerpc/kernel/lparmap.s
  AS      arch/powerpc/kernel/head_64.o
lparmap.c: Assembler messages:
lparmap.c:84: Error: file number 1 already allocated
make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1
make: *** [arch/powerpc/kernel] Error 2

Weird. Could you do make V=1 and send me the output?

Ok. Here it goes. The last screen. If you need all / more feel free to mail me. Config is attached - please note that this is default allmodconfig.

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.machine_kexec.o.d -nostdinc -isystem /usr/lib/gcc/powerpc-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mcpu=power4 -mno-altivec -funit-at-a-time -mno-string -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(machine_kexec)" -D"KBUILD_MODNAME=KBUILD_STR(machine_kexec)" -c -o arch/powerpc/kernel/.tmp_machine_kexec.o arch/powerpc/kernel/machine_kexec.c

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.crash.o.d -nostdinc -isystem /usr/lib/gcc/powerpc-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mcpu=power4 -mno-altivec -funit-at-a-time -mno-string -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(crash)" -D"KBUILD_MODNAME=KBUILD_STR(crash)" -c -o arch/powerpc/kernel/.tmp_crash.o arch/powerpc/kernel/crash.c

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.machine_kexec_64.o.d -nostdinc -isystem /usr/lib/gcc/powerpc-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mcpu=power4 -mno-altivec -funit-at-a-time -mno-string -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(machine_kexec_64)" -D"KBUILD_MODNAME=KBUILD_STR(machine_kexec_64)" -c -o arch/powerpc/kernel/.tmp_machine_kexec_64.o arch/powerpc/kernel/machine_kexec_64.c

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.audit.o.d -nostdinc -isystem /usr/lib/gcc/powerpc-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mcpu=power4 -mno-altivec -funit-at-a-time -mno-string -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(audit)" -D"KBUILD_MODNAME=KBUILD_STR(audit)" -c -o arch/powerpc/kernel/.tmp_audit.o arch/powerpc/kernel/audit.c

gcc -m64 -Wp,-MD,arch/powerpc/kernel/.swsusp_64.o.d -nostdinc -isystem /usr/lib/gcc/powerpc-linux-gnu/4.1.2/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -msoft-float -pipe -mminimal-toc -mtraceback=none -mcall-aixdesc -mcpu=power4 -mno-altivec -funit-at-a-time -mno-string -Wa,-maltivec -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -mno-minimal-toc -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(swsusp_64)" -D"KBUILD_MODNAME=KBUILD_STR(swsusp_64)" -c -o arch/powerpc/kernel/.tmp_swsusp_64.o arch/powerpc/kernel/swsusp_64.c
Re: [NFS] 2.6.23-rc1-mm2
On 08/03, Trond Myklebust wrote:
>
> On Fri, 2007-08-03 at 09:38 -0700, Andrew Morton wrote:
> > > stack backtrace:
> > >  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
> > >  [<c0105c02>] show_trace+0x12/0x20
> > >  [<c0105d15>] dump_stack+0x15/0x20
> > >  [<c013ee42>] __lock_acquire+0xc22/0x1030
> > >  [<c013f2b1>] lock_acquire+0x61/0x80
> > >  [<c012edd9>] flush_workqueue+0x49/0x70
> > >  [<c012ee0d>] flush_scheduled_work+0xd/0x10
> > >  [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
> > >  [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
> > >  [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
> > >  [<c017b38d>] deactivate_super+0x7d/0xa0
> > >  [<c018f94b>] mntput_no_expire+0x4b/0x80
> > >  [<c018fd94>] expire_mount_list+0xe4/0x140
> > >  [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
> > >  [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
> > >  [<c012e61b>] run_workqueue+0x12b/0x1e0
> > >  [<c012f05b>] worker_thread+0x9b/0x100
> > >  [<c0131c72>] kthread+0x42/0x70
> > >  [<c0104c0f>] kernel_thread_helper+0x7/0x18
> > > ===
> >
> > There is new debugging stuff in -mm: deadlockable usage of workqueue
> > primitives will now trigger lockdep warnings. See
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-flushing-deadlocks-with-lockdep.patch
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-work-related-deadlocks-with-lockdep.patch
> >
> > I am suspecting that running flush_scheduled_work() from within
> > run_workqueue() isn't good.
>
> I'll have a look at this. I suspect that most if not all of our calls to
> run_workqueue()/flush_scheduled_work() can now be replaced by more
> targeted calls to cancel_work_sync() and cancel_delayed_work_sync().

Yes, please, if possible.

To avoid a possible confusion: it is still OK if work->func() flushes its
own workqueue, so strictly speaking this trace is a false positive, but it
would be very nice if we can get rid of this practice.

Oleg.
2.6.23-rc1-mm2 + cpufreq patch -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
I am running Ubuntu Gutsy with latest updates. When I run
"/etc/init.d/networking stop" with my custom kernel, I get:

=================================
[ INFO: inconsistent lock state ]
2.6.23-rc1-mm2 #21
---------------------------------
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ifconfig/8982 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (>lock){++..}, at: [] rtl8139_interrupt+0x22/0x377 [8139too]
{in-hardirq-W} state was registered at:
  [] __lock_acquire+0x430/0xbca
  [] lock_acquire+0x76/0x9d
  [] _spin_lock+0x23/0x32
  [] rtl8139_interrupt+0x22/0x377 [8139too]
  [] handle_IRQ_event+0x1a/0x4f
  [] handle_fasteoi_irq+0x7d/0xb6
  [] do_IRQ+0xaf/0xd9
  [] 0x
irq event stamp: 1501
hardirqs last  enabled at (1501): [] kfree+0xc7/0xdb
hardirqs last disabled at (1500): [] kfree+0x67/0xdb
softirqs last  enabled at (1480): [] dev_deactivate+0x87/0xa0
softirqs last disabled at (1478): [] _spin_lock_bh+0xb/0x37

other info that might help us debug this:
1 lock held by ifconfig/8982:
 #0:  (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f

stack backtrace:
 [] show_trace_log_lvl+0x12/0x25
 [] show_trace+0xd/0x10
 [] dump_stack+0x16/0x18
 [] print_usage_bug+0x107/0x114
 [] mark_lock+0x1e9/0x400
 [] __lock_acquire+0x4a4/0xbca
 [] lock_acquire+0x76/0x9d
 [] _spin_lock+0x23/0x32
 [] rtl8139_interrupt+0x22/0x377 [8139too]
 [] free_irq+0xc9/0xf2
 [] rtl8139_close+0xac/0x14a [8139too]
 [] dev_close+0x4e/0x6b
 [] dev_change_flags+0x9f/0x152
 [] devinet_ioctl+0x209/0x506
 [] inet_ioctl+0x86/0xa4
 [] sock_ioctl+0x1a9/0x1c7
 [] do_ioctl+0x22/0x67
 [] vfs_ioctl+0x249/0x25c
 [] sys_ioctl+0x2c/0x45
 [] sysenter_past_esp+0x6b/0xb5
===
eth2: link down
Re: 2.6.23-rc1-mm2: Fix crash in sysfs_hash_and_remove
Tejun Heo <[EMAIL PROTECTED]> writes:

> Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
>>
>> My test box crashes during suspend, while the nonboot CPUs are being
>> disabled, because sysfs_hash_and_remove() doesn't check if dir_sd
>> passed to it is not NULL. Fix it.
>>
>> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
>
> It got broken when shadow support was added. The shadow support in -mm1
> will be dropped and Eric is preparing a new version. So, this fix
> probably won't be necessary from -mm2.

Agreed. That check is in my current development tree.

Eric
Re: 2.6.23-rc1-mm2
On 8/2/07, Andy Whitcroft <[EMAIL PROTECTED]> wrote:
> vmemmap x86_64: ensure end of section memmap is initialised
>
> Similar to the generic initialisers, the x86_64 vmemmap
> initialisation may incorrectly skip the last page of a section if
> the section start is not aligned to the page.
>
> Where we have a section spanning the end of a PMD we will check the
> start of the section at A populating it. We will then move on 1
> PMD page to C and find ourselves beyond the end of the section which
> ends at B we will complete without checking the second PMD page.
>
> |       PMD       |       PMD       |
>         |   SECTION   |
>         A             B   C
>
> We should round ourselves to the end of the PMD as we iterate.
>
> Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>
> ---
>  arch/x86_64/mm/init.c |    9 +++++----
>  1 files changed, 5 insertions(+), 4 deletions(-)
> diff --git a/arch/x86_64/mm/init.c b/arch/x86_64/mm/init.c
> index ac49df0..5d1ed03 100644
> --- a/arch/x86_64/mm/init.c
> +++ b/arch/x86_64/mm/init.c
> @@ -792,9 +792,10 @@ int __meminit vmemmap_populate_pmd(pud_t *pud, unsigned long addr,
>  						unsigned long end, int node)
>  {
>  	pmd_t *pmd;
> +	unsigned long next;
>
> -	for (pmd = pmd_offset(pud, addr); addr < end;
> -						pmd++, addr += PMD_SIZE)
> +	for (pmd = pmd_offset(pud, addr); addr < end; pmd++, addr = next) {
> +		next = pmd_addr_end(addr, end);
>  		if (pmd_none(*pmd)) {
>  			pte_t entry;
>  			void *p = vmemmap_alloc_block(PMD_SIZE, node);
> @@ -808,8 +809,8 @@ int __meminit vmemmap_populate_pmd(pud_t *pud, unsigned long addr,
>  			printk(KERN_DEBUG " [%lx-%lx] PMD ->%p on node %d\n",
>  				addr, addr + PMD_SIZE - 1, p, node);
>  		} else
> -			vmemmap_verify((pte_t *)pmd, node,
> -						pmd_addr_end(addr, end), end);
> +			vmemmap_verify((pte_t *)pmd, node, next, end);
> +	}
>  	return 0;
>  }
>  #endif
>

That patch applied to 2.6.23-rc1-mm2 boots.

But I still get the MP-BIOS bug, now with an additional Call Trace:

[ 27.034907] ACPI: Core revision 20070126
[ 27.082090] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 27.132617] WARNING: at kernel/irq/resend.c:69 check_irq_resend()
[ 27.150837]
[ 27.150837] Call Trace:
[ 27.162621] [] check_irq_resend+0xbc/0xd0
[ 27.179558] [] enable_irq+0xf0/0x100
[ 27.195177] [] setup_IO_APIC+0x6c4/0x9a0
[ 27.211833] [] set_cpus_allowed+0x64/0xc0
[ 27.228749] [] smp_prepare_cpus+0x434/0x460
[ 27.246183] [] kernel_init+0x67/0x350
[ 27.262062] [] child_rip+0xa/0x12
[ 27.276928] [] acpi_ds_init_one_object+0x0/0x7c
[ 27.295425] [] kernel_init+0x0/0x350
[ 27.311043] [] child_rip+0x0/0x12
[ 27.325881]
[ 27.463199] Using local APIC timer interrupts.
[ 27.514874] result 12500129
[ 27.523240] Detected 12.500 MHz APIC timer.

It no longer seems to matter whether it was a warm or a cold start.
Otherwise the system seems to be working normally.

Torsten
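The iteration problem described in the quoted patch can be reproduced with plain address arithmetic in userspace. The sketch below is an illustration only: it assumes 2 MB PMDs as on x86_64, pmd_addr_end_model() copies the arithmetic of the kernel's pmd_addr_end() into a hypothetical helper, and the two counting functions are made-up stand-ins for the old and patched loops (no page tables are touched).

```c
#include <stddef.h>

#define PMD_SHIFT 21
#define PMD_SIZE  (1UL << PMD_SHIFT)	/* 2 MB, assumed as on x86_64 */
#define PMD_MASK  (~(PMD_SIZE - 1))

/* Next PMD boundary after addr, capped at end (models pmd_addr_end()). */
static unsigned long pmd_addr_end_model(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE) & PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Old loop shape: advance by a fixed PMD_SIZE from a possibly unaligned start. */
static size_t pmds_visited_fixed_step(unsigned long addr, unsigned long end)
{
	size_t visited = 0;

	for (; addr < end; addr += PMD_SIZE)
		visited++;
	return visited;
}

/* Patched loop shape: advance to the next PMD boundary on every iteration. */
static size_t pmds_visited_rounded(unsigned long addr, unsigned long end)
{
	size_t visited = 0;
	unsigned long next;

	for (; addr < end; addr = next) {
		next = pmd_addr_end_model(addr, end);
		visited++;
	}
	return visited;
}
```

With a section starting at A = PMD_SIZE/2 and ending at B = PMD_SIZE + PMD_SIZE/4, the fixed-step loop jumps from A straight to C = A + PMD_SIZE, which is past B, so it visits one PMD; the rounded loop stops at the PMD boundary first and visits both.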
Re: 2.6.23-rc1-mm2: Fix crash in sysfs_hash_and_remove
Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[EMAIL PROTECTED]>
>
> My test box crashes during suspend, while the nonboot CPUs are being disabled,
> because sysfs_hash_and_remove() doesn't check if dir_sd passed to it is not
> NULL. Fix it.
>
> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>

It got broken when shadow support was added. The shadow support in -mm1
will be dropped and Eric is preparing a new version. So, this fix
probably won't be necessary from -mm2.

Thanks.

--
tejun
Re: 2.6.23-rc1-mm2: Fix crash in sysfs_hash_and_remove
From: Rafael J. Wysocki <[EMAIL PROTECTED]>

My test box crashes during suspend, while the nonboot CPUs are being disabled,
because sysfs_hash_and_remove() doesn't check if dir_sd passed to it is not
NULL. Fix it.

Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 fs/sysfs/inode.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6.23-rc1-mm2/fs/sysfs/inode.c
===================================================================
--- linux-2.6.23-rc1-mm2.orig/fs/sysfs/inode.c
+++ linux-2.6.23-rc1-mm2/fs/sysfs/inode.c
@@ -191,6 +191,8 @@ int sysfs_hash_and_remove(struct kobject
 	struct sysfs_addrm_cxt acxt;
 	struct sysfs_dirent **pos, *sd;

+	if (!dir_sd)
+		return -ENOENT;
 	sysfs_addrm_start(&acxt, dir_sd);
 	if (!sysfs_resolve_for_remove(kobj, &acxt.parent_sd))
 		goto addrm_finish;
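The guard added by this patch follows a common pattern: reject a NULL directory entry before anything dereferences it. A minimal userspace sketch of that pattern follows. The struct and function names are hypothetical stand-ins, not the sysfs types, and the "removal" is reduced to decrementing a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory entry; not struct sysfs_dirent. */
struct dirent_sketch {
	int nr_children;
};

/*
 * Remove one child from dir_sd. Returns 0 on success, or -ENOENT when
 * there is no directory at all or nothing left to remove.
 */
static int hash_and_remove_sketch(struct dirent_sketch *dir_sd)
{
	/* The early return is what keeps a NULL caller from crashing. */
	if (!dir_sd)
		return -ENOENT;

	if (dir_sd->nr_children == 0)
		return -ENOENT;

	dir_sd->nr_children--;
	return 0;
}
```

Returning the same error for "no directory" and "no such entry" lets callers treat both as "nothing to remove", which is presumably why the patch picks -ENOENT.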
Re: unionfs compile error ( Re: 2.6.23-rc1-mm2 )
In message <[EMAIL PROTECTED]>, Josef Sipek writes:
> On Wed, Aug 01, 2007 at 10:22:07AM -0700, Andrew Morton wrote:
> > On Wed, 01 Aug 2007 12:33:18 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:
> >
> > > fs/unionfs/file.c:147: error: 'file_fsync' undeclared here (not in a function)
> > > make[2]: *** [fs/unionfs/file.o] Error 1
> > > make[1]: *** [fs/unionfs] Error 2
> > > make: *** [fs] Error 2
> > > make: *** Waiting for unfinished jobs
> > >
> > > ...
> > >
> > > Config can be found there -> http://194.231.229.228/MM/config-auto-3
> >
> > This, I assume:
> >
> > --- a/fs/unionfs/file.c~git-unionfs-fix-2
> > +++ a/fs/unionfs/file.c
> > @@ -17,6 +17,7 @@
> >   */
> >
> >  #include "union.h"
> > +#include <linux/buffer_head.h>
> >
> >  /***
> >   * File Operations *
> > _
> >
> > (and no, sorry, I will not be complicit in that
> > single-header-file-which-includes-the-whole-world junk).
>
> Ouch. I had a fix for this, and it managed to get lost in the pile of
> patches.
>
> I'll fix it up and push fix to kernel.org.

Jeff, make sure you push my fix which will work even if CONFIG_BLOCK=n is
set.

> Jeff.

Erez.
Re: [linux-usb-devel] 2.6.23-rc1-mm2
On Thu, 2 Aug 2007, Alan Stern wrote:

> > > > > uhci_hcd :00:0c.0: dma_pool_free buffer-32, 6b6b6b6b/6b6b6b6b
> > > > > (bad dma)
> > > I guess the patch below (which I have just added to my tree) fixes that,
> > > right? Thanks.
> > Yes - that's correct. This patch fixes the bug. Thanks.
> Does it also fix the "dma_pool_free" error?

I believe it should -- caused by calling usb_buffer_free() with bogus
dma_addr_t, as corresponding usbhid_device has been already kfree()d.

--
Jiri Kosina
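A side note on the 6b6b6b6b values in that report: 0x6b is the byte the slab allocator writes over freed objects when poisoning is enabled (POISON_FREE in include/linux/poison.h), which fits the explanation that the handle was read from an already-freed structure. The sketch below only illustrates the idea in userspace. The struct, poison_on_free() and looks_freed() are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the value of POISON_FREE from include/linux/poison.h. */
#define POISON_FREE_BYTE 0x6b

/* Hypothetical object holding a DMA handle; not the usbhid structure. */
struct buf_sketch {
	uint32_t dma_handle;
};

/* Simulate what a poisoning allocator does to an object on free. */
static void poison_on_free(struct buf_sketch *b)
{
	memset(b, POISON_FREE_BYTE, sizeof(*b));
}

/* Return 1 if every byte of value is the free-poison byte, else 0. */
static int looks_freed(uint32_t value)
{
	unsigned char bytes[sizeof(value)];
	size_t i;

	memcpy(bytes, &value, sizeof(value));
	for (i = 0; i < sizeof(bytes); i++)
		if (bytes[i] != POISON_FREE_BYTE)
			return 0;
	return 1;
}
```

A handle that reads back as 0x6b6b6b6b after the owning object has been freed is therefore a strong hint of a use-after-free rather than a corrupted DMA pool.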
Re: [linux-usb-devel] 2.6.23-rc1-mm2
> > > > > ===
> > > > > uhci_hcd :00:0c.0: dma_pool_free buffer-32, 6b6b6b6b/6b6b6b6b
> > > > > (bad dma)
> > >
> > > Mariusz,
> > >
> > > I guess the patch below (which I have just added to my tree) fixes
> > > that, right? Thanks.
> >
> > Yes - that's correct. This patch fixes the bug. Thanks.
>
> Does it also fix the "dma_pool_free" error?

Yes - it does.

Regards,

Mariusz
Re: [linux-usb-devel] 2.6.23-rc1-mm2
On Thu, 2 Aug 2007, Mariusz Kozlowski wrote:

> > > > ===
> > > > uhci_hcd :00:0c.0: dma_pool_free buffer-32, 6b6b6b6b/6b6b6b6b (bad
> > > > dma)
> >
> > Mariusz,
> >
> > I guess the patch below (which I have just added to my tree) fixes that,
> > right? Thanks.
>
> Yes - that's correct. This patch fixes the bug. Thanks.

Does it also fix the "dma_pool_free" error?

Alan Stern
Re: 2.6.23-rc1-mm2
On Thu, Aug 02, 2007 at 12:40:59AM +0100, Mel Gorman wrote: > On (01/08/07 22:52), Torsten Kaiser didst pronounce: > > On 8/1/07, Andrew Morton <[EMAIL PROTECTED]> wrote: > > > On Wed, 01 Aug 2007 16:30:08 -0400 > > > [EMAIL PROTECTED] wrote: > > > > > > > As an aside, it looks like bits of dynticks-for-x86_64 are in > > > > there. > > > > In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch > > > > is in > > > > there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then > > > > nothing > > > > in the x86_64 tree actually *sets* it. There's a few other > > > > dynticks-related > > > > prep patches in there as well. Does this mean it's back to "coming > > > > soon to > > > > a CPU near you" status? :) > > > > > > I've lost the plot on that stuff: I'm just leaving things as-is for now, > > > wait for Thomas to return from vacation so we can have another run at it. > > > > For what its worth: 2.6.22-rc6-mm1 with NO_HZ works for me on an AMD > > SMP system without trouble. 
> >
> > Next try with 2.6.23-rc1-mm2 and SPARSEMEM:
> > Probably the same exception, but this time with Call Trace:
> > [0.00] Bootmem setup node 0 -8000
> > [0.00] Bootmem setup node 1 8000-00012000
> > [0.00] Zone PFN ranges:
> > [0.00] DMA 0 -> 4096
> > [0.00] DMA32 4096 -> 1048576
> > [0.00] Normal 1048576 -> 1179648
> > [0.00] Movable zone start PFN for each node
> > [0.00] early_node_map[4] active PFN ranges
> > [0.00] 0: 0 -> 159
> > [0.00] 0: 256 -> 524288
> > [0.00] 1: 524288 -> 917488
> > [0.00] 1: 1048576 -> 1179648
> > PANIC: early exception rip 807cddb5 error 2 cr2 e2000310
> > [0.00]
> > [0.00] Call Trace:
> > [0.00] [] memmap_init_zone+0xb5/0x130
> > [0.00] [] init_currently_empty_zone+0x84/0x110
> > [0.00] [] free_area_init_node+0x393/0x3e0
> > [0.00] [] free_area_init_nodes+0x2da/0x320
> > [0.00] [] paging_init+0x87/0x90
> > [0.00] [] setup_arch+0x355/0x470
> > [0.00] [] start_kernel+0x57/0x330
> > [0.00] [] _sinittext+0x12d/0x140
> > [0.00]
> > [0.00] RIP memmap_init_zone+0xb5/0x130
> >
> > (gdb) list *0x807cddb5
> > 0x807cddb5 is in memmap_init_zone (include/linux/list.h:32).
> > 27 #define LIST_HEAD(name) \
> > 28 struct list_head name = LIST_HEAD_INIT(name)
> > 29
> > 30 static inline void INIT_LIST_HEAD(struct list_head *list)
> > 31 {
> > 32 list->next = list;
> > 33 list->prev = list;
> > 34 }
> > 35
> > 36 /*
> >
> > I will test more tomorrow...
>
> Well That doesn't make a whole pile of sense unless the memory map
> is not present. Looking at your boot log, we see this gem

This implies that page->lru is invalid. Which implies that the memory map is indeed not present.

However, if we look at the code in detail we have actually already updated several fields in the struct page already. Particularly we have already updated the flags, _count, and _mapcount. It is when we touch lru which we blammo. All of the good entries are in the first 24 bytes of the struct page, lru is in the 8th 64bit word, or +64 bytes.
Looking at the faulting address it is e2000310, i.e. the fault is 16 bytes into a page. So the first three elements of this struct page are in one PMD mapped page, and the lru in the next. As this has SPARSEMEM_VMEMMAP enabled, that implies that the vmemmap has not been filled out correctly.

Looking at the x86_64 initialiser it appears that we have the same bug that Kame-san reported against the generic initialisers. At the end of this email is a proposed patch for this; could you apply that to a clean 2.6.23-rc1-mm2 tree and give it a test for me? I have boot tested this on our x86_64 boxes, but they happen to be sized and laid out so as not to trip this bug. Let me know if it fixes things up for you and I will push it upstream.

If this patch does not fix it, could you please get us a boot log at loglevel=8 of an unmodified 2.6.23-rc1-mm2 kernel? This should give sufficient debug on how the vmemmap is initialised.

> > [0.00] 1: 524288 -> 917488
> > [0.00] 1: 1048576 -> 1179648
[...]

-apw
Re: 2.6.23-rc1-mm2
Hello, > > > usb 2-1: USB disconnect, address 2 > > > BUG: atomic counter underflow at: > > > [] show_trace_log_lvl+0x1a/0x30 > > > [] show_trace+0x12/0x14 > > > [] dump_stack+0x15/0x17 > > > [] __free_pages+0x50/0x52 > > > [] free_pages+0x1f/0x21 > > > [] dma_free_coherent+0x43/0x9c > > > [] hcd_buffer_free+0x43/0x6a > > > [] usb_buffer_free+0x23/0x29 > > > [] hid_free_buffers+0x23/0x71 > > > [] hid_disconnect+0xb0/0xc8 > > > [] usb_unbind_interface+0x30/0x72 > > > [] __device_release_driver+0x6a/0x92 > > > [] device_release_driver+0x20/0x36 > > > [] bus_remove_device+0x62/0x85 > > > [] device_del+0x16d/0x27c > > > [] usb_disable_device+0x7a/0xe2 > > > [] usb_disconnect+0x94/0xde > > > [] hub_thread+0x2fe/0xc1b > > > [] kthread+0x36/0x58 > > > [] kernel_thread_helper+0x7/0x14 > > > === > > > uhci_hcd :00:0c.0: dma_pool_free buffer-32, 6b6b6b6b/6b6b6b6b (bad > > > dma) > > Mariusz, > > I guess the patch below (which I have just added to my tree) fixes that, > right? Thanks. Yes - that's correct. This patch fixes the bug. Thanks. Mariusz - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc1-mm2
Hello, usb 2-1: USB disconnect, address 2 BUG: atomic counter underflow at: [c010456a] show_trace_log_lvl+0x1a/0x30 [c010508d] show_trace+0x12/0x14 [c01051e0] dump_stack+0x15/0x17 [c01418cf] __free_pages+0x50/0x52 [c01418f0] free_pages+0x1f/0x21 [c010783d] dma_free_coherent+0x43/0x9c [c0315067] hcd_buffer_free+0x43/0x6a [c030b2b4] usb_buffer_free+0x23/0x29 [c0346db4] hid_free_buffers+0x23/0x71 [c0346eb2] hid_disconnect+0xb0/0xc8 [c0313676] usb_unbind_interface+0x30/0x72 [c02c6df0] __device_release_driver+0x6a/0x92 [c02c71c3] device_release_driver+0x20/0x36 [c02c6736] bus_remove_device+0x62/0x85 [c02c49f8] device_del+0x16d/0x27c [c0310f25] usb_disable_device+0x7a/0xe2 [c030d0bc] usb_disconnect+0x94/0xde [c030e030] hub_thread+0x2fe/0xc1b [c0128aee] kthread+0x36/0x58 [c0104233] kernel_thread_helper+0x7/0x14 === uhci_hcd :00:0c.0: dma_pool_free buffer-32, 6b6b6b6b/6b6b6b6b (bad dma) Mariusz, I guess the patch below (which I have just added to my tree) fixes that, right? Thanks. Yes - that's correct. This patch fixes the bug. Thanks. Mariusz - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc1-mm2
On Thu, Aug 02, 2007 at 12:40:59AM +0100, Mel Gorman wrote: On (01/08/07 22:52), Torsten Kaiser didst pronounce: On 8/1/07, Andrew Morton [EMAIL PROTECTED] wrote: On Wed, 01 Aug 2007 16:30:08 -0400 [EMAIL PROTECTED] wrote: As an aside, it looks like bitspieces of dynticks-for-x86_64 are in there. In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing in the x86_64 tree actually *sets* it. There's a few other dynticks-related prep patches in there as well. Does this mean it's back to coming soon to a CPU near you status? :) I've lost the plot on that stuff: I'm just leaving things as-is for now, wait for Thomas to return from vacation so we can have another run at it. For what its worth: 2.6.22-rc6-mm1 with NO_HZ works for me on an AMD SMP system without trouble. Next try with 2.6.23-rc1-mm2 and SPARSEMEM: Probably the same exception, but this time with Call Trace: [0.00] Bootmem setup node 0 -8000 [0.00] Bootmem setup node 1 8000-00012000 [0.00] Zone PFN ranges: [0.00] DMA 0 - 4096 [0.00] DMA324096 - 1048576 [0.00] Normal1048576 - 1179648 [0.00] Movable zone start PFN for each node [0.00] early_node_map[4] active PFN ranges [0.00] 0:0 - 159 [0.00] 0: 256 - 524288 [0.00] 1: 524288 - 917488 [0.00] 1: 1048576 - 1179648 PANIC: early exception rip 807cddb5 error 2 cr2 e2000310 [0.00] [0.00] Call Trace: [0.00] [807cddb5] memmap_init_zone+0xb5/0x130 [0.00] [807ce874] init_currently_empty_zone+0x84/0x110 [0.00] [807cec93] free_area_init_node+0x393/0x3e0 [0.00] [807cefea] free_area_init_nodes+0x2da/0x320 [0.00] [807c9c97] paging_init+0x87/0x90 [0.00] [807c0f85] setup_arch+0x355/0x470 [0.00] [807bc967] start_kernel+0x57/0x330 [0.00] [807bc12d] _sinittext+0x12d/0x140 [0.00] [0.00] RIP memmap_init_zone+0xb5/0x130 (gdb) list *0x807cddb5 0x807cddb5 is in memmap_init_zone (include/linux/list.h:32). 
27 #define LIST_HEAD(name) \ 28 struct list_head name = LIST_HEAD_INIT(name) 29 30 static inline void INIT_LIST_HEAD(struct list_head *list) 31 { 32 list-next = list; 33 list-prev = list; 34 } 35 36 /* I will test more tomorrow... Well That doesn't make a whole pile of sense unless the memory map is not present. Looking at your boot log, we see this gem This implies that page-lru is invalid. Which implies that the memory map is indeed not present. However, if we look at the code in detail we have actually already updated several fields in the struct page already. Particularly we have already updated the flags, _count, and _mapcount. It is when we touch lru which we blammo. All of the good entries are in the first 24 bytes of the struct page, lru is in the 8th 64bit word, or +64 bytes. Looking at the faulting address it is e2000310, ie the fault is 16 bytes into a page. So the first three elements of this struct page are in one PMD mapped page, and the lru the next. As this has SPARSEMEM_VMEMMAP enabled that implies that the vemmmap has not been filled out correctly. Looking at the x86_64 initialiser it appears that we have the same bug that Kame-san reported against the generic initialisers. At the end of this email is a proposed patch for this, could you apply that to a clean 2.6.23-rc1-mm2 tree and give it a test for me. I have boot tested this on our x86_64 boxes, but they happen to be sized and layed out to not trip this bug. Let me know if it fixes things up for you and I will push it upstream. If this patch does not fix it could you please get us a boot log at loglevel=8 of an unmodified 2.6.23-rc1-mm2 kernel, this should give sufficient debug on how the vmemmap is initialised. [0.00] 1: 524288 - 917488 [0.00] 1: 1048576 - 1179648 [...] 
-apw === 8 === vmemmap x86_64: ensure end of section memmap is initialised Similar to the generic initialisers, the x86_64 vmemmap initialisation may incorrectly skip the last page of a section if the section start is not aligned to the page. Where we have a section spanning the end of a PMD we will check the start of the section at A populating it. We will then move on 1 PMD page to C and find ourselves beyond the end of the section which ends at B we will complete without checking the second PMD page. | PMD | PMD | | SECTION
Re: unionfs compile error ( Re: 2.6.23-rc1-mm2 )
In message [EMAIL PROTECTED], Josef Sipek writes: On Wed, Aug 01, 2007 at 10:22:07AM -0700, Andrew Morton wrote: On Wed, 01 Aug 2007 12:33:18 +0200 Gabriel C [EMAIL PROTECTED] wrote: fs/unionfs/file.c:147: error: 'file_fsync' undeclared here (not in a function) make[2]: *** [fs/unionfs/file.o] Error 1 make[1]: *** [fs/unionfs] Error 2 make: *** [fs] Error 2 make: *** Waiting for unfinished jobs ... Config can be found there - http://194.231.229.228/MM/config-auto-3 This, I assume: --- a/fs/unionfs/file.c~git-unionfs-fix-2 +++ a/fs/unionfs/file.c @@ -17,6 +17,7 @@ */ #include union.h +#include linux/buffer_head.h /*** * File Operations * _ (and no, sorry, I will not be complicit in that single-header-file-which-includes-the-whole-world junk). Ouch. I had a fix for this, and it managed to get lost in the pile of patches. I'll fix it up and push fix to kernel.org. Jeff, make sure you push my fix which will work even if CONFIG_BLOCK=n is set. Jeff. Erez. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc1-mm2: Fix crash in sysfs_hash_and_remove
From: Rafael J. Wysocki [EMAIL PROTECTED] My test box crashes during suspend, while the nonboot CPUs are being disabled, because sysfs_hash_and_remove() doesn't check if dir_sd passed to it is not NULL. Fix it. Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED] --- fs/sysfs/inode.c |2 ++ 1 file changed, 2 insertions(+) Index: linux-2.6.23-rc1-mm2/fs/sysfs/inode.c === --- linux-2.6.23-rc1-mm2.orig/fs/sysfs/inode.c +++ linux-2.6.23-rc1-mm2/fs/sysfs/inode.c @@ -191,6 +191,8 @@ int sysfs_hash_and_remove(struct kobject struct sysfs_addrm_cxt acxt; struct sysfs_dirent **pos, *sd; + if (!dir_sd) + return -ENOENT; sysfs_addrm_start(acxt, dir_sd); if (!sysfs_resolve_for_remove(kobj, acxt.parent_sd)) goto addrm_finish; - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc1-mm2
On 8/2/07, Andy Whitcroft [EMAIL PROTECTED] wrote: vmemmap x86_64: ensure end of section memmap is initialised Similar to the generic initialisers, the x86_64 vmemmap initialisation may incorrectly skip the last page of a section if the section start is not aligned to the page. Where we have a section spanning the end of a PMD we will check the start of the section at A populating it. We will then move on 1 PMD page to C and find ourselves beyond the end of the section which ends at B we will complete without checking the second PMD page. | PMD | PMD | | SECTION | A B C We should round ourselves to the end of the PMD as we iterate. Signed-off-by: Andy Whitcroft [EMAIL PROTECTED] --- arch/x86_64/mm/init.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/arch/x86_64/mm/init.c b/arch/x86_64/mm/init.c index ac49df0..5d1ed03 100644 --- a/arch/x86_64/mm/init.c +++ b/arch/x86_64/mm/init.c @@ -792,9 +792,10 @@ int __meminit vmemmap_populate_pmd(pud_t *pud, unsigned long addr, unsigned long end, int node) { pmd_t *pmd; + unsigned long next; - for (pmd = pmd_offset(pud, addr); addr end; - pmd++, addr += PMD_SIZE) + for (pmd = pmd_offset(pud, addr); addr end; pmd++, addr = next) { + next = pmd_addr_end(addr, end); if (pmd_none(*pmd)) { pte_t entry; void *p = vmemmap_alloc_block(PMD_SIZE, node); @@ -808,8 +809,8 @@ int __meminit vmemmap_populate_pmd(pud_t *pud, unsigned long addr, printk(KERN_DEBUG [%lx-%lx] PMD -%p on node %d\n, addr, addr + PMD_SIZE - 1, p, node); } else - vmemmap_verify((pte_t *)pmd, node, - pmd_addr_end(addr, end), end); + vmemmap_verify((pte_t *)pmd, node, next, end); + } return 0; } #endif That patch applied to 2.6.23-rc1-mm2 boots. 
But I still get the MP-BIOS bug, now with an additional Call Trace:

[ 27.034907] ACPI: Core revision 20070126
[ 27.082090] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 27.132617] WARNING: at kernel/irq/resend.c:69 check_irq_resend()
[ 27.150837]
[ 27.150837] Call Trace:
[ 27.162621] [80261c4c] check_irq_resend+0xbc/0xd0
[ 27.179558] [802617c0] enable_irq+0xf0/0x100
[ 27.195177] [807c6984] setup_IO_APIC+0x6c4/0x9a0
[ 27.211833] [80234e74] set_cpus_allowed+0x64/0xc0
[ 27.228749] [807c4e14] smp_prepare_cpus+0x434/0x460
[ 27.246183] [807bc627] kernel_init+0x67/0x350
[ 27.262062] [8020cac8] child_rip+0xa/0x12
[ 27.276928] [803d4f80] acpi_ds_init_one_object+0x0/0x7c
[ 27.295425] [807bc5c0] kernel_init+0x0/0x350
[ 27.311043] [8020cabe] child_rip+0x0/0x12
[ 27.325881]
[ 27.463199] Using local APIC timer interrupts.
[ 27.514874] result 12500129
[ 27.523240] Detected 12.500 MHz APIC timer.

It no longer seems to matter whether it was a warm or cold start.
Otherwise the system seems to be working normally.

Torsten
Re: 2.6.23-rc1-mm2: Fix crash in sysfs_hash_and_remove
Tejun Heo <[EMAIL PROTECTED]> writes:

> Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[EMAIL PROTECTED]>
> >
> > My test box crashes during suspend, while the nonboot CPUs are being disabled,
> > because sysfs_hash_and_remove() doesn't check if dir_sd passed to it is not
> > NULL. Fix it.
> >
> > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
>
> It got broken when shadow support was added. The shadow support in -mm1
> will be dropped and Eric is preparing a new version. So, this fix
> probably won't be necessary from -mm2.

Agreed. That check is in my current development tree.

Eric
2.6.23-rc1-mm2 + cpufreq patch -- inconsistent {in-hardirq-W} - {hardirq-on-W} usage.
I am running Ubuntu Gutsy with latest updates. When I run /etc/init.d/networking stop with my custom kernel, I get:

=
[ INFO: inconsistent lock state ]
2.6.23-rc1-mm2 #21
-
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ifconfig/8982 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&tp->lock){++..}, at: [f8b9d65a] rtl8139_interrupt+0x22/0x377 [8139too]
{in-hardirq-W} state was registered at:
 [c0144837] __lock_acquire+0x430/0xbca
 [c0145047] lock_acquire+0x76/0x9d
 [c031ffe0] _spin_lock+0x23/0x32
 [f8b9d65a] rtl8139_interrupt+0x22/0x377 [8139too]
 [c0154b44] handle_IRQ_event+0x1a/0x4f
 [c0155c6c] handle_fasteoi_irq+0x7d/0xb6
 [c0109a5a] do_IRQ+0xaf/0xd9
 [] 0x
irq event stamp: 1501
hardirqs last enabled at (1501): [c0173232] kfree+0xc7/0xdb
hardirqs last disabled at (1500): [c01731d2] kfree+0x67/0xdb
softirqs last enabled at (1480): [c02cf082] dev_deactivate+0x87/0xa0
softirqs last disabled at (1478): [c031fffa] _spin_lock_bh+0xb/0x37

other info that might help us debug this:
1 lock held by ifconfig/8982:
 #0: (rtnl_mutex){--..}, at: [c031ed95] mutex_lock+0x1c/0x1f

stack backtrace:
 [c01080ab] show_trace_log_lvl+0x12/0x25
 [c0108a9e] show_trace+0xd/0x10
 [c0108bac] dump_stack+0x16/0x18
 [c0143374] print_usage_bug+0x107/0x114
 [c0143bdd] mark_lock+0x1e9/0x400
 [c01448ab] __lock_acquire+0x4a4/0xbca
 [c0145047] lock_acquire+0x76/0x9d
 [c031ffe0] _spin_lock+0x23/0x32
 [f8b9d65a] rtl8139_interrupt+0x22/0x377 [8139too]
 [c015506f] free_irq+0xc9/0xf2
 [f8b9ef56] rtl8139_close+0xac/0x14a [8139too]
 [c02c2135] dev_close+0x4e/0x6b
 [c02c128a] dev_change_flags+0x9f/0x152
 [c03015f2] devinet_ioctl+0x209/0x506
 [c0301f5a] inet_ioctl+0x86/0xa4
 [c02b8211] sock_ioctl+0x1a9/0x1c7
 [c017ffca] do_ioctl+0x22/0x67
 [c0180258] vfs_ioctl+0x249/0x25c
 [c0180297] sys_ioctl+0x2c/0x45
 [c0106f02] sysenter_past_esp+0x6b/0xb5
===
eth2: link down
Re: 2.6.23-rc1-mm2
On 8/2/07, Mel Gorman <[EMAIL PROTECTED]> wrote: > On (01/08/07 22:52), Torsten Kaiser didst pronounce: > > Next try with 2.6.23-rc1-mm2 and SPARSEMEM: > > Probably the same exception, but this time with Call Trace: > > [0.00] Bootmem setup node 0 -8000 > > [0.00] Bootmem setup node 1 8000-00012000 > > [0.00] Zone PFN ranges: > > [0.00] DMA 0 -> 4096 > > [0.00] DMA324096 -> 1048576 > > [0.00] Normal1048576 -> 1179648 > > [0.00] Movable zone start PFN for each node > > [0.00] early_node_map[4] active PFN ranges > > [0.00] 0:0 -> 159 > > [0.00] 0: 256 -> 524288 > > [0.00] 1: 524288 -> 917488 > > [0.00] 1: 1048576 -> 1179648 > > PANIC: early exception rip 807cddb5 error 2 cr2 e2000310 > > [0.00] > > [0.00] Call Trace: > > [0.00] [] memmap_init_zone+0xb5/0x130 > > [0.00] [] init_currently_empty_zone+0x84/0x110 > > [0.00] [] free_area_init_node+0x393/0x3e0 > > [0.00] [] free_area_init_nodes+0x2da/0x320 > > [0.00] [] paging_init+0x87/0x90 > > [0.00] [] setup_arch+0x355/0x470 > > [0.00] [] start_kernel+0x57/0x330 > > [0.00] [] _sinittext+0x12d/0x140 > > [0.00] > > [0.00] RIP memmap_init_zone+0xb5/0x130 > > > > (gdb) list *0x807cddb5 > > 0x807cddb5 is in memmap_init_zone (include/linux/list.h:32). > > 27 #define LIST_HEAD(name) \ > > 28 struct list_head name = LIST_HEAD_INIT(name) > > 29 > > 30 static inline void INIT_LIST_HEAD(struct list_head *list) > > 31 { > > 32 list->next = list; > > 33 list->prev = list; > > 34 } > > 35 > > 36 /* > > > > I will test more tomorrow... > > Well That doesn't make a whole pile of sense unless the memory map > is not present. Looking at your boot log, we see this gem > > > [0.00] 1: 524288 -> 917488 > > [0.00] 1: 1048576 -> 1179648 Complete bootlog, if you need more info about the memmaps... 
[0.00] Linux version 2.6.23-rc1-mm2 ([EMAIL PROTECTED]) (gcc version 4.2.1 (Gentoo 4.2.1 p1.4)) #1 SMP Wed Aug 1 21:56:36 CEST 2007 [0.00] Command line: earlyprintk=serial,ttyS0,38400 console=ttyS0,38400 console=tty1 crypt_root=/dev/md1 [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009fc00 (usable) [0.00] BIOS-e820: 0009fc00 - 000a (reserved) [0.00] BIOS-e820: 000e4000 - 0010 (reserved) [0.00] BIOS-e820: 0010 - dfff (usable) [0.00] BIOS-e820: dfff - dfffe000 (ACPI data) [0.00] BIOS-e820: dfffe000 - e000 (ACPI NVS) [0.00] BIOS-e820: fec0 - fec01000 (reserved) [0.00] BIOS-e820: fee0 - fef0 (reserved) [0.00] BIOS-e820: ff70 - 0001 (reserved) [0.00] BIOS-e820: 0001 - 00012000 (usable) [0.00] console [earlyser0] enabled [0.00] end_pfn_map = 1179648 kernel direct mapping tables up to 12000 @ 8000-e000 [0.00] DMI present. [0.00] ACPI: RSDP 000FB5E0, 0014 (r0 ACPIAM) [0.00] ACPI: RSDT DFFF, 003C (r1 A M I OEMRSDT 6000626 MSFT 97) [0.00] ACPI: FACP DFFF0200, 0084 (r2 A M I OEMFACP 6000626 MSFT 97) [0.00] ACPI: DSDT DFFF0450, 48E1 (r1 S0027 S00270000 INTL 20051117) [0.00] ACPI: FACS DFFFE000, 0040 [0.00] ACPI: APIC DFFF0390, 0080 (r1 A M I OEMAPIC 6000626 MSFT 97) [0.00] ACPI: MCFG DFFF0410, 003C (r1 A M I OEMMCFG 6000626 MSFT 97) [0.00] ACPI: OEMB DFFFE040, 0060 (r1 A M I AMI_OEM 6000626 MSFT 97) [0.00] ACPI: SRAT DFFF4D40, 0110 (r1 AMDHAMMER 1 AMD 1) [0.00] ACPI: SSDT DFFF4E50, 04F0 (r1 A M I ACPI2PPC1 AMI 1) [0.00] SRAT: PXM 0 -> APIC 0 -> Node 0 [0.00] SRAT: PXM 0 -> APIC 1 -> Node 0 [0.00] SRAT: PXM 1 -> APIC 2 -> Node 1 [0.00] SRAT: PXM 1 -> APIC 3 -> Node 1 [0.00] SRAT: Node 0 PXM 0 0-a [0.00] SRAT: Node 0 PXM 0 0-8000 [0.00] SRAT: Node 1 PXM 1 8000-e000 [0.00] SRAT: Node 1 PXM 1 8000-12000 [0.00] Bootmem setup node 0 -8000 [0.00] Bootmem s
Re: 2.6.23-rc1-mm2
On (01/08/07 22:52), Torsten Kaiser didst pronounce:
> On 8/1/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 01 Aug 2007 16:30:08 -0400 [EMAIL PROTECTED] wrote:
> >
> > > As an aside, it looks like bits of dynticks-for-x86_64 are in there.
> > > In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in
> > > there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing
> > > in the x86_64 tree actually *sets* it. There's a few other dynticks-related
> > > prep patches in there as well. Does this mean it's back to "coming soon to
> > > a CPU near you" status? :)
> >
> > I've lost the plot on that stuff: I'm just leaving things as-is for now,
> > wait for Thomas to return from vacation so we can have another run at it.
>
> For what it's worth: 2.6.22-rc6-mm1 with NO_HZ works for me on an AMD
> SMP system without trouble.
>
> Next try with 2.6.23-rc1-mm2 and SPARSEMEM:
> Probably the same exception, but this time with Call Trace:
> [0.00] Bootmem setup node 0 -8000
> [0.00] Bootmem setup node 1 8000-00012000
> [0.00] Zone PFN ranges:
> [0.00] DMA 0 -> 4096
> [0.00] DMA32 4096 -> 1048576
> [0.00] Normal 1048576 -> 1179648
> [0.00] Movable zone start PFN for each node
> [0.00] early_node_map[4] active PFN ranges
> [0.00] 0: 0 -> 159
> [0.00] 0: 256 -> 524288
> [0.00] 1: 524288 -> 917488
> [0.00] 1: 1048576 -> 1179648
> PANIC: early exception rip 807cddb5 error 2 cr2 e2000310
> [0.00]
> [0.00] Call Trace:
> [0.00] [] memmap_init_zone+0xb5/0x130
> [0.00] [] init_currently_empty_zone+0x84/0x110
> [0.00] [] free_area_init_node+0x393/0x3e0
> [0.00] [] free_area_init_nodes+0x2da/0x320
> [0.00] [] paging_init+0x87/0x90
> [0.00] [] setup_arch+0x355/0x470
> [0.00] [] start_kernel+0x57/0x330
> [0.00] [] _sinittext+0x12d/0x140
> [0.00]
> [0.00] RIP memmap_init_zone+0xb5/0x130
>
> (gdb) list *0x807cddb5
> 0x807cddb5 is in memmap_init_zone (include/linux/list.h:32).
> 27 #define LIST_HEAD(name) \
> 28 struct list_head name = LIST_HEAD_INIT(name)
> 29
> 30 static inline void INIT_LIST_HEAD(struct list_head *list)
> 31 {
> 32 list->next = list;
> 33 list->prev = list;
> 34 }
> 35
> 36 /*
>
> I will test more tomorrow...

Well, that doesn't make a whole pile of sense unless the memory map is not present. Looking at your boot log, we see this gem:

> [0.00] 1: 524288 -> 917488
> [0.00] 1: 1048576 -> 1179648

Node 1 spans a region with a nice little hole in the middle of DMA32. On our test machines we wouldn't see a hole like this, at least not that I can recall, so it would appear to work on some machines.

On SPARSEMEM, sparse_init() is responsible for allocating memmap for each section. In 2.6.22-rc6-mm1, it allocated the memory if the section was *valid*. In 2.6.23-rc1-mm1, it allocates the memory if the section is *present* due to the patch sparsemem-record-when-a-section-has-a-valid-mem_map.patch[1].

Much later in the init process, memmap is initialised based on spanned memory, not present memory, so initialisation will init memmap that resides in holes if a zone spans that area in a node, which is the case on this machine.

I think this is why it kablamos: it inits memmap that wasn't allocated because it's not present, and the surprise is that it doesn't blow up sooner.

Please try the patch below, Torsten. Thanks.

[1] Yeah, I acked this patch and I had read through it. My bad if the patch below does fix the problem.

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc1-mm2-clean/mm/sparse.c linux-2.6.23-rc1-mm2-present_revert/mm/sparse.c
--- linux-2.6.23-rc1-mm2-clean/mm/sparse.c 2007-08-01 10:09:39.0 +0100
+++ linux-2.6.23-rc1-mm2-present_revert/mm/sparse.c 2007-08-02 00:27:00.0 +0100
@@ -483,7 +483,7 @@ void __init sparse_init(void)
 	unsigned long *usemap;

 	for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
-		if (!present_section_nr(pnum))
+		if (!valid_section_nr(pnum))
 			continue;

 		map = sparse_early_mem_map_alloc(pnum);
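Editor's sketch of the hole described above, not kernel code. It is a small userspace model that takes node 1's two active PFN ranges from the boot log and counts how many whole sections inside the spanned range contain no present memory. PAGES_PER_SECTION = 32768 (128 MiB sections with 4 KiB pages) is an assumption about this configuration, and the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 32768 pages per section (128 MiB sections, 4 KiB pages). */
#define PAGES_PER_SECTION 32768UL

struct pfn_range {
	unsigned long start;	/* first PFN in the range */
	unsigned long end;	/* one past the last PFN */
};

/* Node 1's active PFN ranges, copied from the boot log. */
static const struct pfn_range node1_ranges[] = {
	{ 524288, 917488 },
	{ 1048576, 1179648 },
};

/* True if any page of section nr lies inside one of the active ranges. */
static bool section_present(const struct pfn_range *r, size_t n,
			    unsigned long nr)
{
	unsigned long s = nr * PAGES_PER_SECTION;
	unsigned long e = s + PAGES_PER_SECTION;

	for (size_t i = 0; i < n; i++)
		if (r[i].start < e && s < r[i].end)
			return true;
	return false;
}

/*
 * Count sections inside the spanned range [span_start, span_end) that
 * hold no present memory. In this model, those are the sections that
 * get no memmap when allocation is keyed on "present" while
 * initialisation walks everything that is spanned.
 */
static unsigned long count_hole_sections(const struct pfn_range *r, size_t n,
					 unsigned long span_start,
					 unsigned long span_end)
{
	unsigned long first = span_start / PAGES_PER_SECTION;
	unsigned long last = (span_end - 1) / PAGES_PER_SECTION;
	unsigned long holes = 0;

	for (unsigned long nr = first; nr <= last; nr++)
		if (!section_present(r, n, nr))
			holes++;
	return holes;
}
```

Calling count_hole_sections(node1_ranges, 2, 524288, 1179648) walks sections 16 through 35. Under the stated section size, the sections covering PFNs 917504 to 1048575 fall entirely inside the hole, which is the kind of region where an init loop over spanned memory would touch memmap that was never allocated.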
Re: 2.6.23-rc1-mm2
On Wed, 1 Aug 2007 22:52:44 +0200 "Torsten Kaiser" <[EMAIL PROTECTED]> wrote:

> On 8/1/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 01 Aug 2007 16:30:08 -0400 [EMAIL PROTECTED] wrote:
> >
> > > As an aside, it looks like bits of dynticks-for-x86_64 are in there.
> > > In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in
> > > there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing
> > > in the x86_64 tree actually *sets* it. There's a few other dynticks-related
> > > prep patches in there as well. Does this mean it's back to "coming soon to
> > > a CPU near you" status? :)
> >
> > I've lost the plot on that stuff: I'm just leaving things as-is for now,
> > wait for Thomas to return from vacation so we can have another run at it.
>
> For what it's worth: 2.6.22-rc6-mm1 with NO_HZ works for me on an AMD
> SMP system without trouble.
>
> Next try with 2.6.23-rc1-mm2 and SPARSEMEM:
> Probably the same exception, but this time with Call Trace:
> [0.00] Bootmem setup node 0 -8000
> [0.00] Bootmem setup node 1 8000-00012000
> [0.00] Zone PFN ranges:
> [0.00] DMA 0 -> 4096
> [0.00] DMA32 4096 -> 1048576
> [0.00] Normal 1048576 -> 1179648
> [0.00] Movable zone start PFN for each node
> [0.00] early_node_map[4] active PFN ranges
> [0.00] 0: 0 -> 159
> [0.00] 0: 256 -> 524288
> [0.00] 1: 524288 -> 917488
> [0.00] 1: 1048576 -> 1179648
> PANIC: early exception rip 807cddb5 error 2 cr2 e2000310

It's cryptically telling us that the code tried to access 0xe2000310.

> [0.00]
> [0.00] Call Trace:
> [0.00] [] memmap_init_zone+0xb5/0x130
> [0.00] [] init_currently_empty_zone+0x84/0x110
> [0.00] [] free_area_init_node+0x393/0x3e0
> [0.00] [] free_area_init_nodes+0x2da/0x320
> [0.00] [] paging_init+0x87/0x90
> [0.00] [] setup_arch+0x355/0x470
> [0.00] [] start_kernel+0x57/0x330
> [0.00] [] _sinittext+0x12d/0x140
> [0.00]
> [0.00] RIP memmap_init_zone+0xb5/0x130
>
> (gdb) list *0x807cddb5
> 0x807cddb5 is in memmap_init_zone (include/linux/list.h:32).
> 27 #define LIST_HEAD(name) \
> 28 struct list_head name = LIST_HEAD_INIT(name)
> 29
> 30 static inline void INIT_LIST_HEAD(struct list_head *list)
> 31 {
> 32 list->next = list;
> 33 list->prev = list;
> 34 }
> 35
> 36 /*
>
> I will test more tomorrow...

Thanks. Please send the .config?
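Editor's sketch for reading the PANIC line, assuming the early exception is a page fault, in which case cr2 holds the faulting address and "error 2" is the x86 page-fault error code. The low three bits of that code have architecturally defined meanings. The macro, struct and function names below are made up for illustration and are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Low three bits of the x86 page-fault error code. */
#define PF_PROT  0x1UL	/* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2UL	/* 0: read access, 1: write access */
#define PF_USER  0x4UL	/* 0: kernel mode, 1: user mode */

struct fault_info {
	bool page_present;	/* fault hit a present page */
	bool is_write;		/* faulting access was a write */
	bool from_user;		/* fault was raised in user mode */
};

/* Split an error code into the three flags above. */
static struct fault_info decode_fault(unsigned long error_code)
{
	struct fault_info info = {
		.page_present = (error_code & PF_PROT) != 0,
		.is_write = (error_code & PF_WRITE) != 0,
		.from_user = (error_code & PF_USER) != 0,
	};
	return info;
}
```

For decode_fault(2), only is_write is set: a kernel-mode write to a page that is not present. That is consistent with the gdb listing, where the faulting instruction is the store list->next = list in INIT_LIST_HEAD().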
Re: 2.6.23-rc1-mm2
On 8/1/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 01 Aug 2007 16:30:08 -0400 [EMAIL PROTECTED] wrote:
>
> > As an aside, it looks like bits of dynticks-for-x86_64 are in there.
> > In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in
> > there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing
> > in the x86_64 tree actually *sets* it. There's a few other dynticks-related
> > prep patches in there as well. Does this mean it's back to "coming soon to
> > a CPU near you" status? :)
>
> I've lost the plot on that stuff: I'm just leaving things as-is for now,
> wait for Thomas to return from vacation so we can have another run at it.

For what it's worth: 2.6.22-rc6-mm1 with NO_HZ works for me on an AMD SMP system without trouble.

Next try with 2.6.23-rc1-mm2 and SPARSEMEM:
Probably the same exception, but this time with Call Trace:
[0.00] Bootmem setup node 0 -8000
[0.00] Bootmem setup node 1 8000-00012000
[0.00] Zone PFN ranges:
[0.00] DMA 0 -> 4096
[0.00] DMA32 4096 -> 1048576
[0.00] Normal 1048576 -> 1179648
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[4] active PFN ranges
[0.00] 0: 0 -> 159
[0.00] 0: 256 -> 524288
[0.00] 1: 524288 -> 917488
[0.00] 1: 1048576 -> 1179648
PANIC: early exception rip 807cddb5 error 2 cr2 e2000310
[0.00]
[0.00] Call Trace:
[0.00] [] memmap_init_zone+0xb5/0x130
[0.00] [] init_currently_empty_zone+0x84/0x110
[0.00] [] free_area_init_node+0x393/0x3e0
[0.00] [] free_area_init_nodes+0x2da/0x320
[0.00] [] paging_init+0x87/0x90
[0.00] [] setup_arch+0x355/0x470
[0.00] [] start_kernel+0x57/0x330
[0.00] [] _sinittext+0x12d/0x140
[0.00]
[0.00] RIP memmap_init_zone+0xb5/0x130

(gdb) list *0x807cddb5
0x807cddb5 is in memmap_init_zone (include/linux/list.h:32).
27 #define LIST_HEAD(name) \
28 struct list_head name = LIST_HEAD_INIT(name)
29
30 static inline void INIT_LIST_HEAD(struct list_head *list)
31 {
32 list->next = list;
33 list->prev = list;
34 }
35
36 /*

I will test more tomorrow...

Torsten
Re: 2.6.23-rc1-mm2
On Wed, 01 Aug 2007 16:30:08 -0400 [EMAIL PROTECTED] wrote:

> As an aside, it looks like bits of dynticks-for-x86_64 are in there.
> In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in
> there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing
> in the x86_64 tree actually *sets* it. There's a few other dynticks-related
> prep patches in there as well. Does this mean it's back to "coming soon to
> a CPU near you" status? :)

I've lost the plot on that stuff: I'm just leaving things as-is for now, wait for Thomas to return from vacation so we can have another run at it.
Re: 2.6.23-rc1-mm2
On Tue, 31 Jul 2007 23:09:32 PDT, Andrew Morton said:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/

Builds, boots, runs here. Dell Latitude D820, Core2 Duo T7200, x86_64 kernel.

> -loop-use-unlocked_ioctl.patch
>
> Dropped, broken.

Fixes one issue I had in -mm1 (I'm assuming somebody else spotted this one as well, you dropped it before I reported it.. :)

> +tpm_tis-fix-interrupt-probing.patch

And the other...

As an aside, it looks like bits of dynticks-for-x86_64 are in there. In particular, x86_64-enable-high-resolution-timers-and-dynticks.patch is in there, adding a menu that depends on GENERIC_CLOCKEVENTS, but then nothing in the x86_64 tree actually *sets* it. There's a few other dynticks-related prep patches in there as well. Does this mean it's back to "coming soon to a CPU near you" status? :)
Re: 2.6.23-rc1-mm2 (checks-for-80wire-cable-use-in-pata_via)
On 01.08.2007 08:09, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/
...
> +libata-acpi-checks-for-80wire-cable-use-in-pata_via.patch
...
> sata/pata things

Alan, this does not work after a suspend-resume cycle: I get an "ACPI get timing mode failed (AE 0x1001)" error.

$ dmesg | grep ata
...
scsi0 : pata_via
scsi1 : pata_via
ata1: PATA max UDMA/100 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x0001b800 irq 14
ata2: PATA max UDMA/100 cmd 0x00010170 ctl 0x00010376 bmdma 0x0001b808 irq 15
ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
ata1.00: 78165360 sectors, multi 16: LBA
ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
ata1.01: 160086528 sectors, multi 16: LBA
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/100
ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL03, max UDMA/33
ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
ata2.00: configured for UDMA/33
ata2.01: configured for MWDMA2
ata1.00: Unable to set Link PM policy
ata1.01: Unable to set Link PM policy
ata2.00: Unable to set Link PM policy
ata2.01: Unable to set Link PM policy
...
[ suspend-to-disk/resume cycle happens here ]
...
ata1.00: Unable to set Link PM policy
ata1.01: Unable to set Link PM policy
ata2.00: Unable to set Link PM policy
ata2.01: Unable to set Link PM policy
ata1: ACPI get timing mode failed (AE 0x1001) <==
ata1.00: limited to UDMA/33 due to 40-wire cable
ata1.01: limited to UDMA/33 due to 40-wire cable
ata1.00: configured for UDMA/33
ata1.01: configured for UDMA/33
ata2: ACPI get timing mode failed (AE 0x1001)
ata2.00: configured for UDMA/33
ata2.01: configured for MWDMA2

Anyway, long before 2.6.23-rc1-mm2, 80-wire cable detection was already wrong after a suspend-resume cycle. So I cooked the following patch 2 days ago. It may be the wrong approach but it works for me.

--
pata_via: preserve cable detection bits in via_do_set_mode

via_cable_detect performs cable detection by checking bits in PCI configuration space, but via_do_set_mode overwrites these bits. This behaviour breaks cable detection after a suspend/resume cycle. So let's teach via_do_set_mode to preserve the cable detection bits.

Signed-off-by: Laurent Riffard <[EMAIL PROTECTED]>
---
 drivers/ata/pata_via.c | 7 +++
 1 file changed, 7 insertions(+)

Index: linux-2.6-mm/drivers/ata/pata_via.c
===
--- linux-2.6-mm.orig/drivers/ata/pata_via.c
+++ linux-2.6-mm/drivers/ata/pata_via.c
@@ -238,6 +238,7 @@ static void via_do_set_mode(struct ata_p
 	unsigned long T = 10 / via_clock;
 	unsigned long UT = T/tdiv;
 	int ut;
+	u8 cable80_status;
 	int offset = 3 - (2*ap->port_no) - adev->devno;

@@ -287,6 +288,12 @@ static void via_do_set_mode(struct ata_p
 			ut = t.udma ? (0xe0 | (FIT(t.udma, 2, 9) - 2)) : 0x07;
 			break;
 	}
+
+	/* Preserve cable detection bit */
+	pci_read_config_byte(pdev, 0x50 + offset, &cable80_status);
+	cable80_status &= 0x10;
+	ut |= cable80_status;
+
 	/* Set UDMA unless device is not UDMA capable */
 	if (udma_type)
 		pci_write_config_byte(pdev, 0x50 + offset, ut);
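Editor's sketch of the read-modify-write idea in the patch, pulled out into plain userspace C so the bit handling can be checked in isolation. The 0x10 mask comes from the patch. The function name and the use of uint8_t in place of the kernel's u8 are illustrative assumptions, and no PCI access is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Cable detection bit, using the mask from the patch. */
#define CABLE80_BIT 0x10u

/*
 * Combine a freshly computed UDMA timing byte with the value currently
 * in the register, carrying over only the cable detection bit from the
 * old value. Like the patch, this ORs the bit in and never clears it
 * from the new timing value.
 */
static uint8_t merge_udma_timing(uint8_t current_reg, uint8_t new_timing)
{
	return (uint8_t)(new_timing | (current_reg & CABLE80_BIT));
}
```

With the bit set in the old value, merge_udma_timing(0x10, 0x07) gives 0x17, so a later cable check that tests 0x10 would still see it after the mode is reprogrammed. With the bit clear, the new timing byte passes through unchanged.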
Re: 2.6.23-rc1-mm2
Andrew Morton wrote:
> On Wed, 01 Aug 2007 12:56:09 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:
>
>> Andrew Morton wrote:
>> ...
>>> - git-wireless is back. It is still a >3MB diff, and appears to compile.
>>>
>> ...
>>
>> allmodconfig on UML
>>
>> ...
>>
>> In file included from drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c:48:
>> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_pio.h: In function 'bcm43xx_pio_write':
>> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_pio.h:97: error: implicit declaration of function 'mmiowb'
>> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c: In function 'bcm43xx_init':
>> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c:4051: warning: label 'err_dfs_exit' defined but not used
>> make[4]: *** [drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.o] Error 1
>> make[3]: *** [drivers/net/wireless/bcm43xx-mac80211] Error 2
>> make[2]: *** [drivers/net/wireless] Error 2
>> make[1]: *** [drivers/net] Error 2
>> make: *** [drivers] Error 2
>> make: *** Waiting for unfinished jobs
>>
>
> (cc linux-wireless)
>
> Probably Kconfig troubles again.
>

Maybe something from SSB?

...
scripts/kconfig/conf -s arch/um/Kconfig
net/bluetooth/hidp/Kconfig:4:warning: 'select' used by config symbol 'BT_HIDP' refers to undefined symbol 'HID'
drivers/net/Kconfig:1456:warning: 'select' used by config symbol 'B44_PCI' refers to undefined symbol 'SSB_PCIHOST'
drivers/net/Kconfig:1457:warning: 'select' used by config symbol 'B44_PCI' refers to undefined symbol 'SSB_DRIVER_PCICORE'
drivers/net/Kconfig:1437:warning: 'select' used by config symbol 'B44' refers to undefined symbol 'SSB'
drivers/net/Kconfig:2112:warning: 'select' used by config symbol 'R8169' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:552:warning: 'select' used by config symbol 'RTL8187' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:637:warning: 'select' used by config symbol 'RT2X00_LIB_RFKILL' refers to undefined symbol 'INPUT_POLLDEV'
drivers/net/wireless/Kconfig:643:warning: 'select' used by config symbol 'RT2400PCI' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:662:warning: 'select' used by config symbol 'RT2500PCI' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/Kconfig:682:warning: 'select' used by config symbol 'RT61PCI' refers to undefined symbol 'EEPROM_93CX6'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:14:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCI' refers to undefined symbol 'SSB_PCIHOST'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:15:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCI' refers to undefined symbol 'SSB_DRIVER_PCICORE'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:28:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCMCIA' refers to undefined symbol 'SSB_PCMCIAHOST'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:48:warning: 'select' used by config symbol 'BCM43XX_MAC80211_DEBUG' refers to undefined symbol 'SSB_DEBUG'
drivers/net/wireless/bcm43xx-mac80211/Kconfig:5:warning: 'select' used by config symbol 'BCM43XX_MAC80211' refers to undefined symbol 'SSB'
lib/Kconfig.kgdb:71:warning: 'select' used by config symbol 'KGDB_8250_NOMODULE' refers to undefined symbol 'SERIAL_8250'
...
Re: unionfs compile error ( Re: 2.6.23-rc1-mm2 )
Andrew Morton wrote:
> On Wed, 01 Aug 2007 12:33:18 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:
>
>> fs/unionfs/file.c:147: error: 'file_fsync' undeclared here (not in a function)
>> make[2]: *** [fs/unionfs/file.o] Error 1
>> make[1]: *** [fs/unionfs] Error 2
>> make: *** [fs] Error 2
>> make: *** Waiting for unfinished jobs
>>
>> ...
>>
>> Config can be found there -> http://194.231.229.228/MM/config-auto-3
>>
>
> This, I assume:

Yes, this fixes it.

>
> --- a/fs/unionfs/file.c~git-unionfs-fix-2
> +++ a/fs/unionfs/file.c
> @@ -17,6 +17,7 @@
> */
>
> #include "union.h"
> +#include
>
> /***
> * File Operations *
> _
>
> (and no, sorry, I will not be complicit in that
> single-header-file-which-includes-the-whole-world junk).
Re: 2.6.23-rc1-mm2
On Wed, 01 Aug 2007 12:56:09 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> ...
> > - git-wireless is back. It is still a >3MB diff, and appears to compile.
> >
> ...
>
> allmodconfig on UML
>
> ...
>
> In file included from drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c:48:
> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_pio.h: In function 'bcm43xx_pio_write':
> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_pio.h:97: error: implicit declaration of function 'mmiowb'
> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c: In function 'bcm43xx_init':
> drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.c:4051: warning: label 'err_dfs_exit' defined but not used
> make[4]: *** [drivers/net/wireless/bcm43xx-mac80211/bcm43xx_main.o] Error 1
> make[3]: *** [drivers/net/wireless/bcm43xx-mac80211] Error 2
> make[2]: *** [drivers/net/wireless] Error 2
> make[1]: *** [drivers/net] Error 2
> make: *** [drivers] Error 2
> make: *** Waiting for unfinished jobs
>

(cc linux-wireless)

Probably Kconfig troubles again.
Re: unionfs compile error ( Re: 2.6.23-rc1-mm2 )
On Wed, Aug 01, 2007 at 10:22:07AM -0700, Andrew Morton wrote:
> On Wed, 01 Aug 2007 12:33:18 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:
>
> > fs/unionfs/file.c:147: error: 'file_fsync' undeclared here (not in a function)
> > make[2]: *** [fs/unionfs/file.o] Error 1
> > make[1]: *** [fs/unionfs] Error 2
> > make: *** [fs] Error 2
> > make: *** Waiting for unfinished jobs
> >
> > ...
> >
> > Config can be found there -> http://194.231.229.228/MM/config-auto-3
> >
>
> This, I assume:
>
> --- a/fs/unionfs/file.c~git-unionfs-fix-2
> +++ a/fs/unionfs/file.c
> @@ -17,6 +17,7 @@
> */
>
> #include "union.h"
> +#include
>
> /***
> * File Operations *
> _
>
> (and no, sorry, I will not be complicit in that
> single-header-file-which-includes-the-whole-world junk).

Ouch. I had a fix for this, and it managed to get lost in the pile of patches. I'll fix it up and push the fix to kernel.org.

Jeff.

--
I abhor a system designed for the "user", if that word is a coded pejorative meaning "stupid and unsophisticated."
- Ken Thompson
Re: unionfs compile error ( Re: 2.6.23-rc1-mm2 )
On Wed, 01 Aug 2007 12:33:18 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:

> fs/unionfs/file.c:147: error: 'file_fsync' undeclared here (not in a function)
> make[2]: *** [fs/unionfs/file.o] Error 1
> make[1]: *** [fs/unionfs] Error 2
> make: *** [fs] Error 2
> make: *** Waiting for unfinished jobs
>
> ...
>
> Config can be found there -> http://194.231.229.228/MM/config-auto-3
>

This, I assume:

--- a/fs/unionfs/file.c~git-unionfs-fix-2
+++ a/fs/unionfs/file.c
@@ -17,6 +17,7 @@
 */

 #include "union.h"
+#include

 /***
 * File Operations *
_

(and no, sorry, I will not be complicit in that single-header-file-which-includes-the-whole-world junk).
Re: build error on powerpc for 2.6.23-rc1-mm2
On Wed, 1 Aug 2007 13:52:05 +0530 Rishikesh K Rajak <[EMAIL PROTECTED]> wrote:

>
> Hi Andrew,
>
> I am getting following error on powerpc almost for allyesconfig, allmodconfig.
>
> Error produced:
>
> PowerPC: allmodconfig,allyesconfig
>
> CC arch/powerpc/mm/tlb_64.o
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/copyuser_64.S: Assembler messages:
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/copyuser_64.S:27: Error: Unrecognized opcode: `mtocrf'
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/copyuser_64.S:138: Error: Unrecognized opcode: `mtocrf'
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/copyuser_64.S:153: Error: Unrecognized opcode: `mtocrf'
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/memcpy_64.S: Assembler messages:
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/memcpy_64.S:15: Error: Unrecognized opcode: `mtocrf'
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/memcpy_64.S:131: Error: Unrecognized opcode: `mtocrf'
> /home/risrajak/TEST2/TEST/linux/CurrentTest/23-rc1-mm2-testing/linux-2.6.23-rc1/arch/powerpc/lib/memcpy_64.S:146: Error: Unrecognized opcode: `mtocrf'
> make[2]: *** [arch/powerpc/lib/memcpy_64.o] Error 1
> make[2]: *** Waiting for unfinished jobs
> CC arch/powerpc/sysdev/mpic_u3msi.o
> make[2]: *** [arch/powerpc/lib/copyuser_64.o] Error 1
> make[1]: *** [arch/powerpc/lib] Error 2
>

Presumably, disabling CONFIG_POWER4_ONLY would fix that.

It could be that your toolchain is insufficiently recent. Which version of binutils are you using?
Re: 2.6.23-rc1-mm2
On Wed, Aug 01, 2007 at 10:02:30AM +0200, Mariusz Kozlowski wrote:
> Then reattaching a usb mouse caused this (only once)
>
> usb 2-1: USB disconnect, address 2
> BUG: atomic counter underflow at:
> [] show_trace_log_lvl+0x1a/0x30
> [] show_trace+0x12/0x14
> [] dump_stack+0x15/0x17
> [] __free_pages+0x50/0x52
> [] free_pages+0x1f/0x21
> [] dma_free_coherent+0x43/0x9c
> [] hcd_buffer_free+0x43/0x6a
> [] usb_buffer_free+0x23/0x29
> [] hid_free_buffers+0x23/0x71
> [] hid_disconnect+0xb0/0xc8
> [] usb_unbind_interface+0x30/0x72
> [] __device_release_driver+0x6a/0x92
> [] device_release_driver+0x20/0x36
> [] bus_remove_device+0x62/0x85
> [] device_del+0x16d/0x27c
> [] usb_disable_device+0x7a/0xe2
> [] usb_disconnect+0x94/0xde
> [] hub_thread+0x2fe/0xc1b
> [] kthread+0x36/0x58
> [] kernel_thread_helper+0x7/0x14
> ===
> uhci_hcd :00:0c.0: dma_pool_free buffer-32, 6b6b6b6b/6b6b6b6b (bad dma)
>
> Every new try shows:
>
> usb 2-1: new low speed USB device using uhci_hcd and address 3
> usb 2-1: new device found, idVendor=046d, idProduct=c00e
> usb 2-1: new device strings: Mfr=1, Product=2, SerialNumber=0
> usb 2-1: Product: USB-PS/2 Optical Mouse
> usb 2-1: Manufacturer: Logitech
> usb 2-1: configuration #1 chosen from 1 choice
> input: Logitech USB-PS/2 Optical Mouse as /class/input/input9
> input: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-:00:0c.0-1
> usb 2-1: USB disconnect, address 3
> uhci_hcd :00:0c.0: dma_pool_free buffer-32, 6b6b6b6b/6b6b6b6b (bad dma)
>
> But mouse works ok.

Can you see if the patch posted by Jiri fixes this or not?

thanks,

greg k-h
Re: 2.6.23-rc1-mm2
Christoph Hellwig wrote:
> On Wed, Aug 01, 2007 at 01:10:33AM -0700, Andrew Morton wrote:
> > I was hoping for a 2.6.24 merge. But I haven't actually looked at it yet.
> > Hopefully Jason is planning to get it all out for review soonish.
>
> The current version is quite messy. I'd be much happier if we could
> start with a light version that doesn't have all the intrusions to
> random code outside the kgdb core.

I would disagree on at least one level. The KGDB tree is broken up into incremental units, each layer adding more functionality and/or arch-specific pieces.

As an example, the KGDB core itself is:
http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=commit;h=53956620b1b293300c5ae99a783cf6a7ce8175f9

If you can point to some specific examples rather than a blanket statement that it "is quite messy", perhaps I can explain what the changes are for and why they are needed.

Jason.
Re: 2.6.23-rc1-mm2
On Wed, Aug 01, 2007 at 01:10:33AM -0700, Andrew Morton wrote:
> I was hoping for a 2.6.24 merge. But I haven't actually looked at it yet.
> Hopefully Jason is planning to get it all out for review soonish.

The current version is quite messy. I'd be much happier if we could start with a light version that doesn't have all the intrusions to random code outside the kgdb core.
drivers/scsi/advansys.c compile error ( Re: 2.6.23-rc1-mm2 )
Getting this with a randconfig ( http://194.231.229.228/MM/randconfig-auto-10 )

...
drivers/scsi/advansys.c:794:2: warning: #warning this driver is still not properly converted to the DMA API
drivers/scsi/advansys.c: In function 'advansys_board_found':
drivers/scsi/advansys.c:17781: error: implicit declaration of function 'to_pci_dev'
drivers/scsi/advansys.c:17781: warning: pointer/integer type mismatch in conditional expression
drivers/scsi/advansys.c:17788: warning: unused variable 'pci_memory_address'
drivers/scsi/advansys.c:17781: warning: unused variable 'pdev'
make[2]: *** [drivers/scsi/advansys.o] Error 1
make[1]: *** [drivers/scsi] Error 2
make[1]: *** Waiting for unfinished jobs
...
Re: 2.6.23-rc1-mm2
Mariusz Kozlowski writes:

> Second issue as reported earlier allmodconfig fails to build on imac g3.

Do you really mean g3? If so it's a 32-bit kernel and it shouldn't be building lparmap.s. Or do you mean G5?

> CC arch/powerpc/kernel/lparmap.s
> AS arch/powerpc/kernel/head_64.o
> lparmap.c: Assembler messages:
> lparmap.c:84: Error: file number 1 already allocated
> make[1]: *** [arch/powerpc/kernel/head_64.o] Blad 1
> make: *** [arch/powerpc/kernel] Blad 2

Weird. Could you do make V=1 and send me the output?

Paul.