Re: 2.6.24-rc4-mm1

2007-12-11 Thread Reuben Farrelly



On 11/12/2007 8:11 AM, Andrew Morton wrote:

On Tue, 11 Dec 2007 01:48:39 +1100
Reuben Farrelly <[EMAIL PROTECTED]> wrote:



On 5/12/2007 4:17 PM, Andrew Morton wrote:

Temporarily at

  http://userweb.kernel.org/~akpm/2.6.24-rc4-mm1/

Will appear later at

  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/


- Lots of device IDs have been removed from the e1000 driver and moved over
  to e1000e.  So if your e1000 stops working, you forgot to set CONFIG_E1000E.

- The s390 build is still broken.

I'm seeing this most incredibly unhelpful (to debug) but fortunately
reproducible problem (so far 4/4 times) on this -mm kernel.  I thought this
problem might be related to another bug which I have reported (a TCP oops),
but even after applying a likely fix for that I am still seeing this problem.


The machine boots up perfectly fine and runs well until I load it up.
In this case I can reliably cause this to occur by pulling a 3G ISO across the
GigE network from my Linux box to my PC.  After maybe 50M or so, the console
just displays this (ignore initial boot banner):


--

 * Starting local ... [ ok ]


This is tornado.reub.net (Linux x86_64 2.6.24-rc4-mm1) 00:24:01

tornado login: *** buffer overf

---

Yes - after displaying the 'f' in what I can only guess is the word 'overflow',
the box spontaneously reboots.  There is no further console output until it 
starts to come back up again.


The problem does not exist in 2.6.23-gentoo kernels nor in a vanilla 
2.6.24-rc4-git6 (phew!), so this looks to be an -mm only problem at this stage.


I enabled a number of kernel debugging options but then I got no output at all 
when the machine crashed.


I'm at a bit of a loss as to which subsystem this might be coming from, so I'm 
not sure who to CC.


Box information is (still) up at 
http://www.reub.net/files/kernel/2.6.24-rc4-mm1/



hm.  grepping around for "buffer overflow" doesn't turn up anything except in
drivers which you won't be using on that machine.

I'd be suspecting networking, obviously.  If you're feeling keen could you
please grab a 2.6.24-rc4 tree and apply 2.6.24-rc4-mm1's origin.patch and
git-net.patch and see if the bug is still present?


No - seems to be fine with just origin.patch and git-net.patch.

Just for good measure I then reverted git-net.patch and applied 
git-netdev-all.patch instead, and still wasn't able to trigger the reboot or 
console message, no matter how hard I tried.


I guess for now I'll sit on it, and if it appears in the next -mm it'll probably 
annoy me enough and inspire me to dig deeper (or, "guess" deeper, given the lack 
of direction as to where to even begin).


Reuben
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
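
The narrowing-down suggested in this exchange (apply a subset of the -mm patches to a known-good base, then retest) generalises to a binary search over an ordered patch series. The following is only a minimal Python sketch of that idea. It assumes the patches apply cleanly in series order and that the bug, once introduced, stays present for every longer prefix of the series. `first_bad_patch` and the `is_bad` callback are hypothetical names, and the series entries other than origin.patch and git-net.patch are placeholders.

```python
from typing import Callable, Optional, Sequence


def first_bad_patch(series: Sequence[str],
                    is_bad: Callable[[int], bool]) -> Optional[str]:
    """Return the first patch whose application makes the tree bad.

    is_bad(n) is a hypothetical callback: it should apply the first n
    patches of `series` to a known-good base, build, run the reproducer
    and return True if the bug shows up.  Assumes the base (n == 0) is
    good and that once the bug appears it stays for all larger n.
    """
    if not series or not is_bad(len(series)):
        return None          # the full series is fine: nothing to find
    lo, hi = 0, len(series)  # invariant: is_bad(lo) is False, is_bad(hi) is True
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return series[hi - 1]


if __name__ == "__main__":
    # Illustrative series only; the culprit position is made up for the demo.
    series = ["origin.patch", "git-net.patch", "patch-c", "patch-d", "patch-e"]
    culprit_index = 3  # pretend the 4th patch (patch-d) introduces the bug
    print(first_bad_patch(series, lambda n: n > culprit_index))
```

Each call to `is_bad` stands for one build-and-reproduce cycle, so a series of N patches needs roughly log2(N) cycles plus the initial check of the full series.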


Re: 2.6.24-rc4-mm1

2007-12-10 Thread Reuben Farrelly



On 5/12/2007 4:17 PM, Andrew Morton wrote:

Temporarily at

  http://userweb.kernel.org/~akpm/2.6.24-rc4-mm1/

Will appear later at

  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/


- Lots of device IDs have been removed from the e1000 driver and moved over
  to e1000e.  So if your e1000 stops working, you forgot to set CONFIG_E1000E.

- The s390 build is still broken.


I'm seeing this most incredibly unhelpful (to debug) but fortunately
reproducible problem (so far 4/4 times) on this -mm kernel.  I thought this
problem might be related to another bug which I have reported (a TCP oops),
but even after applying a likely fix for that I am still seeing this problem.


The machine boots up perfectly fine and runs well until I load it up.
In this case I can reliably cause this to occur by pulling a 3G ISO across the
GigE network from my Linux box to my PC.  After maybe 50M or so, the console
just displays this (ignore initial boot banner):


--

 * Starting local ... [ ok ]


This is tornado.reub.net (Linux x86_64 2.6.24-rc4-mm1) 00:24:01

tornado login: *** buffer overf

---

Yes - after displaying the 'f' in what I can only guess is the word 'overflow',
the box spontaneously reboots.  There is no further console output until it 
starts to come back up again.


The problem does not exist in 2.6.23-gentoo kernels nor in a vanilla 
2.6.24-rc4-git6 (phew!), so this looks to be an -mm only problem at this stage.


I enabled a number of kernel debugging options but then I got no output at all 
when the machine crashed.


I'm at a bit of a loss as to which subsystem this might be coming from, so I'm 
not sure who to CC.


Box information is (still) up at 
http://www.reub.net/files/kernel/2.6.24-rc4-mm1/

Thanks,
Reuben






Re: 2.6.24-rc4-mm1

2007-12-05 Thread Reuben Farrelly

On 5/12/2007 4:17 PM, Andrew Morton wrote:

Temporarily at

  http://userweb.kernel.org/~akpm/2.6.24-rc4-mm1/

Will appear later at

  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/


- Lots of device IDs have been removed from the e1000 driver and moved over
  to e1000e.  So if your e1000 stops working, you forgot to set CONFIG_E1000E.


This non-fatal oops, which I have just noticed, may be related to this change
then; it certainly looks networking-related.


WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert()
Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1

Call Trace:
 <IRQ>  [8046e038] tcp_fastretrans_alert+0x229/0xe63
 [80470975] tcp_ack+0xa3f/0x127d
 [804747b7] tcp_rcv_established+0x55f/0x7f8
 [8047b1aa] tcp_v4_do_rcv+0xdb/0x3a7
 [881148a8] :nf_conntrack:nf_ct_deliver_cached_events+0x75/0x99
 [88120179] :nf_conntrack_ipv4:ipv4_confirm+0x29/0x51
 [8047db71] tcp_v4_rcv+0x9be/0xaed
 [80455eaa] nf_hook_slow+0x60/0xdf
 [8045db6b] ip_local_deliver_finish+0xd3/0x253
 [8045e146] ip_local_deliver+0x3b/0x85
 [8045d7f9] ip_rcv_finish+0x119/0x3b8
 [8045e030] ip_rcv+0x231/0x30c
 [8043ef39] netif_receive_skb+0x215/0x299
 [880b82b9] :e1000e:e1000_receive_skb+0x4d/0x1db
 [880bc200] :e1000e:e1000_clean_rx_irq+0x12c/0x341
 [880ba31a] :e1000e:e1000_clean+0x306/0x58f
 [8022a16a] rebalance_domains+0xec/0x423
 [80261332] handle_edge_irq+0x97/0x13b
 [804412d3] net_rx_action+0xb8/0x11d
 [802344f8] __do_softirq+0x71/0xdd
 [8020c8fc] call_softirq+0x1c/0x30
 [8020e7a5] do_softirq+0x3d/0x8d
 [80234485] irq_exit+0x84/0x86
 [8020e89e] do_IRQ+0x7e/0xe4
 [8020a908] mwait_idle+0x0/0x58
 [8020a7f1] default_idle+0x0/0x43
 [8020bc81] ret_from_intr+0x0/0xa
 <EOI>  [8020a950] mwait_idle+0x48/0x58
 [80209f23] enter_idle+0x22/0x24
 [8020a897] cpu_idle+0x63/0x88
 [804ada75] rest_init+0x55/0x60
 [80627b9a] start_kernel+0x2a4/0x32a
 [8062710b] _sinittext+0x10b/0x120

tornado home #

I have posted a full dmesg as well as my .config and lspci output at
http://www.reub.net/files/kernel/2.6.24-rc4-mm1/ .


Thanks,
Reuben
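
One way to check the "looks networking related" impression is to list which loadable modules the call trace passes through. The Python sketch below is only an illustration. It assumes the frame format seen in this trace, where module code is printed as `:module:symbol+0xoff/0xsize` and built-in code as `symbol+0xoff/0xsize`; the helper names `parse_frames` and `modules_in_trace` are hypothetical, and the sample lines are copied from the trace in this message.

```python
import re
from collections import Counter

FRAME_RE = re.compile(
    r"\[[^\]]*\]\s+"                # address field (may be empty in archives)
    r"(?::(?P<module>[\w-]+):)?"    # optional ':module:' prefix
    r"(?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(trace: str):
    """Yield (module, symbol) pairs; module is None for built-in code."""
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            yield match.group("module"), match.group("symbol")


def modules_in_trace(trace: str) -> Counter:
    """Count frames per loadable module appearing in the trace."""
    return Counter(mod for mod, _ in parse_frames(trace) if mod)


TRACE_SAMPLE = """\
 [8047b1aa] tcp_v4_do_rcv+0xdb/0x3a7
 [881148a8] :nf_conntrack:nf_ct_deliver_cached_events+0x75/0x99
 [8043ef39] netif_receive_skb+0x215/0x299
 [880b82b9] :e1000e:e1000_receive_skb+0x4d/0x1db
 [880bc200] :e1000e:e1000_clean_rx_irq+0x12c/0x341
"""

if __name__ == "__main__":
    print(modules_in_trace(TRACE_SAMPLE))
```

Run over the full trace, this would show e1000e and the netfilter conntrack modules on the receive path, which is consistent with the warning being reached from the network driver's RX handling. It does not by itself say which layer is at fault.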


Re: 2.6.23-rc7-mm1

2007-09-24 Thread Reuben Farrelly



On 25/09/2007 3:12 AM, J. Bruce Fields wrote:

On Mon, Sep 24, 2007 at 09:59:29AM -0700, Andrew Morton wrote:

On Tue, 25 Sep 2007 00:52:30 +1000 Reuben Farrelly <[EMAIL PROTECTED]> wrote:



On 24/09/2007 7:17 PM, Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc7/2.6.23-rc7-mm1/

- New git tree git-powerpc-galak.patch added to the -mm lineup: ppc32
  things, mainly (Kumar Gala <[EMAIL PROTECTED]>)

I'm observing a problem with this kernel (as well as 2.6.23-rc6-mm1) which 
manifests itself only in my Postfix/application mail.logs:


Sep 25 00:25:40 tornado postfix/smtp[12520]: fatal: select lock: Cannot allocate memory
Sep 25 00:25:41 tornado postfix/master[8002]: warning: process /usr/lib64/postfix/smtp pid 12520 exit status 1


This is happening frequently with processes started via 'master' (smtp, smtpd 
and cleanup), but it does not appear to have any noticeable operational impact 
apart from logging a lot of copies of this message.


The corresponding code in Postfix which triggers this is (choice of 3 files in 
src/master are all possibilities which all have much the same code)


Oog.  Looks like it's the "Memory shortage can result in inconsistent
flocks state" patch--the error variable is being set in some cases when
it shouldn't be.  Does the following fix it?

That's in my git tree, not in mainline.  I'll fix up my copy.

And I'll spend some time today figuring out what to do about regression
testing for the posix lock, flock, and lease code.

Thanks for the bug report!

--b.

diff --git a/fs/locks.c b/fs/locks.c
index a6c5917..3e8bfd2 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -740,6 +740,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 		new_fl = locks_alloc_lock();
 		if (new_fl == NULL)
 			goto out;
+		error = 0;
 	}
 
 	for_each_lock(inode, before) {


Yes that has fixed it, thanks!

Reuben
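
The effect of the one-line fix can be illustrated with a toy model of the control flow. This is a Python sketch, not kernel code. It assumes that `error` holds -ENOMEM at the point of the allocation (inferred from the "Cannot allocate memory" symptom and the patch context, not shown in the diff) and that the remainder of the function returns `error` unchanged when nothing else fails. The function name `flock_lock_file_model` is hypothetical.

```python
import errno


def flock_lock_file_model(alloc_succeeds: bool, with_fix: bool) -> int:
    """Toy model of the control flow around the one-line fix.

    Not kernel code.  It assumes `error` was preset to -ENOMEM before the
    allocation and that the rest of the function returns `error`
    unchanged when nothing else goes wrong.
    """
    error = -errno.ENOMEM          # assumed preset, so "goto out" reports ENOMEM
    if not alloc_succeeds:
        return error               # models "goto out" after a failed allocation
    if with_fix:
        error = 0                  # the added line: allocation worked, clear error
    # ... lock list manipulation would happen here ...
    return error


if __name__ == "__main__":
    print(flock_lock_file_model(alloc_succeeds=True, with_fix=False))
    print(flock_lock_file_model(alloc_succeeds=True, with_fix=True))
```

Under these assumptions the unfixed path reports -ENOMEM even though the allocation succeeded, which would match a userspace flock() call failing with "Cannot allocate memory" while the system has plenty of memory.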


Re: 2.6.23-rc7-mm1

2007-09-24 Thread Reuben Farrelly



On 24/09/2007 7:17 PM, Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc7/2.6.23-rc7-mm1/

- New git tree git-powerpc-galak.patch added to the -mm lineup: ppc32
  things, mainly (Kumar Gala <[EMAIL PROTECTED]>)


I'm observing a problem with this kernel (as well as 2.6.23-rc6-mm1) which 
manifests itself only in my Postfix/application mail.logs:


Sep 25 00:25:40 tornado postfix/smtp[12520]: fatal: select lock: Cannot allocate memory
Sep 25 00:25:41 tornado postfix/master[8002]: warning: process /usr/lib64/postfix/smtp pid 12520 exit status 1


This is happening frequently with processes started via 'master' (smtp, smtpd 
and cleanup), but it does not appear to have any noticeable operational impact 
apart from logging a lot of copies of this message.


The corresponding code in Postfix which triggers this is as follows (three files
in src/master are candidates, and all of them have much the same code):


    /*
     * The event loop, at last.
     */
    while (var_use_limit == 0 || use_count < var_use_limit || client_count > 0) {
        if (multi_server_lock != 0) {
            watchdog_stop(watchdog);
            if (myflock(vstream_fileno(multi_server_lock), INTERNAL_LOCK,
                        MYFLOCK_OP_EXCLUSIVE) < 0)
                msg_fatal("select lock: %m");
        }
        watchdog_start(watchdog);
        delay = loop ? loop(multi_server_name, multi_server_argv) : -1;
        event_loop(delay);
    }
    multi_server_exit();
}


Now I'm not convinced this is an application problem, because I'm only seeing 
this after running up kernel 2.6.23-rc6-mm1 or 2.6.23-rc7-mm1 and with NO 
changes to the application itself.  Using the same application binaries it does 
not occur with 2.6.22 mainline.  [I didn't get a lot of testing with the -mm 
release prior to that unfortunately due to some other breakage.]


Is there anything new in the last two or so -mm kernels which could have caused 
this?


I've put my .config up at http://www.reub.net/files/kernel/2.6.23-rc7-mm1.config

Thanks,
Reuben
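
The Postfix loop quoted in this report takes its lock through `myflock()`. A quick way to check whether a plain exclusive file lock misbehaves on a given kernel, independently of Postfix, might look like the Python sketch below. It assumes that `myflock()` with `INTERNAL_LOCK` ends up in flock(2) on this platform, which is not confirmed in this message; `try_exclusive_flock` is a hypothetical helper name.

```python
import fcntl
import os
import tempfile
from typing import Optional


def try_exclusive_flock(path: str) -> Optional[int]:
    """Take and release an exclusive flock() on `path`.

    Returns None on success, or the errno value if flock() fails.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # same kind of lock as MYFLOCK_OP_EXCLUSIVE
        fcntl.flock(fd, fcntl.LOCK_UN)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        err = try_exclusive_flock(tmp.name)
        print("flock ok" if err is None else f"flock failed: {os.strerror(err)}")
```

If this reported ENOMEM on an otherwise idle machine with an affected kernel, and succeeded with the same binaries on 2.6.22 mainline, that would point at the kernel locking code rather than at the application.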


[Serial port bug?] was Re: 2.6.22-rc4-mm2

2007-06-13 Thread Reuben Farrelly

On 7/06/2007 3:03 PM, Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc4/2.6.22-rc4-mm2/

- Basically a bugfixed version of 2.6.22-rc4-mm1.  None of the subsystem
  trees were repulled, several bad patches were dropped, a few were fixed.


I've come home to find my server has locked up hard, with a panic on the screen. 
 This time unlike others, I was able to grab a photo of it for further analysis.


http://www.reub.net/files/kernel/ serial-crash.jpg

[Note also the .config and dmesg in the same directory]

I have had this or a very similar traceback appear about 3 or 4 times now, 
including with a 2.6.21-gentoo kernel (based on mainline), so this bug may well 
be present in mainline.  It is not new to this -mm release.


The bug does not occur on demand; it just seems to happen every few days without
obvious warning.  I haven't reported it until now as I haven't had any other
information to provide other than "some panic seems to happen with a tty_write
something-or-other".


The other possibly crucial piece of information on this is that I have one of my 
serial ports set up as a serial console.  The kernel boot commands for this are:


kernel /vmlinuz-2.6.22-rc4-mm2 ro real_root=/dev/md2 console=tty0 console=ttyS0,57600 panic=30


as well as this:

# SERIAL CONSOLES
s0:12345:respawn:/sbin/agetty 57600 ttyS0 vt100

in inittab.

The other serial port is connected up to my APC UPS and is set up with apcupsd.

Reuben
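
For reference, the boot line in this report registers two consoles: the VGA/virtual terminal (tty0) and the first serial port at 57600 baud. A small Python sketch that pulls the `console=` parameters out of such a command line, in order, is shown here. The helper name `console_args` is hypothetical and the parsing is deliberately simplistic (whitespace-separated tokens, no quoting).

```python
from typing import List, Optional, Tuple


def console_args(cmdline: str) -> List[Tuple[str, Optional[str]]]:
    """Return (device, options) for every console= parameter, in order."""
    consoles = []
    for token in cmdline.split():
        if token.startswith("console="):
            device, _, options = token[len("console="):].partition(",")
            consoles.append((device, options or None))
    return consoles


CMDLINE = ("ro real_root=/dev/md2 console=tty0 "
           "console=ttyS0,57600 panic=30")

if __name__ == "__main__":
    for device, options in console_args(CMDLINE):
        print(device, options)
```

The order matters when reading such a line: as far as the kernel's serial-console documentation describes it, kernel messages go to every listed console, while the last one named is the device behind /dev/console. Here that would be ttyS0, the same port that agetty is respawned on in inittab.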


Re: 2.6.22-rc1-mm1 - Call trace in slub_def.h

2007-05-17 Thread Reuben Farrelly

On 16/05/2007 1:19 PM, Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/


- I found some time to look into some writeback problems in
  fs/fs-writeback.c.  The results were ugly.  There are a pile of fixes here
  but more work (mainly testing) needs to be done.

  There's some new debug code in there which could be very expensive if
  there are a lot of dirty inodes in the machine (quadratic behaviour).  If
  the machine seems to be affected by this, the debugging may be disabled with

echo 0 > /proc/sys/fs/inode_debug

- Added an i386 early-startup development tree, as git-newsetup.patch ("H. 
  Peter Anvin" <[EMAIL PROTECTED]>)


- Brought back git-sas.patch (Darrick J.  Wong <[EMAIL PROTECTED]>).  It got
  lost quite some time ago.


I have just seen this on boot, with 2.6.22-rc2-mm1 on x86_64:

--

libata version 2.20 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
BUG: at include/linux/slub_def.h:88 kmalloc_index()

Call Trace:
 [8034f3f9] pci_dev_put+0x12/0x14
 [80283f30] get_slab+0xb5/0x265
 [802841bc] __kmalloc+0x13/0xa3
 [8021a4aa] cache_k8_northbridges+0x80/0x116
 [8063fed2] gart_iommu_init+0x16/0x594
 [804562ac] genl_rcv+0x0/0x68
 [804548ed] netlink_kernel_create+0x15e/0x16b
 [804acc52] mutex_unlock+0x9/0xb
 [80639fad] pci_iommu_init+0x9/0x12
 [806306af] kernel_init+0x152/0x322
 [80249c7c] trace_hardirqs_on+0xc0/0x14e
 [804ae03d] trace_hardirqs_on_thunk+0x35/0x37
 [80249c7c] trace_hardirqs_on+0xc0/0x14e
 [8020a848] child_rip+0xa/0x12
 [80209f5c] restore_args+0x0/0x30
 [8063055d] kernel_init+0x0/0x322
 [8020a83e] child_rip+0x0/0x12

PCI-GART: No AMD northbridge found.
hpet0: at MMIO 0xfed0, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
ACPI: RTC can wake from S4
pnp: 00:01: iomem range 0xf000-0xf3ff has been reserved
pnp: 00:01: iomem range 0xfed13000-0xfed13fff has been reserved

--

The full dmesg is at http://www.reub.net/files/kernel/2.6.22-rc1-mm1-dmesg and 
the config up at http://www.reub.net/files/kernel/2.6.22-rc1-mm1-config


The machine otherwise seems to run OK.

Reuben


RAID1 "out of memory" error, was Re: 2.6.21-rc5-mm4

2007-04-05 Thread Reuben Farrelly

Hi,

On 3/04/2007 3:47 PM, Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm4/

- The oops in git-net.patch has been fixed, so that tree has been restored. 
  It is huge.


- Added the device-mapper development tree to the -mm lineup (Alasdair
  Kergon).  It is a quilt tree, living at
  ftp://ftp.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/.

- Added davidel's signalfd stuff.


Looks like some damage, or maybe intolerance to on-disk damage, to RAID-1.

md1 is the first array on the disk, and it refuses to start up on boot, or after 
boot.


tornado ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : inactive sda1[0] sdc1[1]
  208640 blocks

md3 : active raid1 sdc3[1] sda3[0]
  20008832 blocks [2/2] [UU]
  bitmap: 0/153 pages [0KB], 64KB chunk

md5 : active raid1 sdc5[1] sda5[0]
  10008384 blocks [2/2] [UU]
  bitmap: 4/153 pages [16KB], 32KB chunk

md6 : active raid1 sdc6[1] sda6[0]
  10008384 blocks [2/2] [UU]
  bitmap: 0/153 pages [0KB], 32KB chunk

md8 : active raid1 sdc8[1] sda8[0]
  1003904 blocks [2/2] [UU]
  bitmap: 0/123 pages [0KB], 4KB chunk

md10 : active raid1 sdc10[1] sda10[0]
  119933120 blocks [2/2] [UU]
  bitmap: 1/229 pages [4KB], 256KB chunk

md2 : active raid1 sdc2[1] sda2[0]
  14544 blocks [2/2] [UU]
  bitmap: 10/191 pages [40KB], 256KB chunk

unused devices: <none>
tornado ~ #
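
For anyone who wants to spot the odd array out without reading the whole listing, here is a minimal sketch of pulling array states out of /proc/mdstat text. It is an illustration only: `parse_mdstat` and `inactive_arrays` are made-up helper names, not part of mdadm or the kernel, and the sample text is a trimmed copy of the output above.

```python
import re

# Trimmed copy of the /proc/mdstat output quoted above (two arrays only).
MDSTAT = """\
Personalities : [raid1]
md1 : inactive sda1[0] sdc1[1]
      208640 blocks

md3 : active raid1 sdc3[1] sda3[0]
      20008832 blocks [2/2] [UU]
      bitmap: 0/153 pages [0KB], 64KB chunk

unused devices: <none>
"""

# An array header looks like "md1 : inactive sda1[0] sdc1[1]".
HEADER = re.compile(r"^(md\d+) : (\w+)\s*(.*)$")


def parse_mdstat(text):
    """Return {array: {"state": ..., "members": [...]}} for each mdN header."""
    arrays = {}
    for line in text.splitlines():
        match = HEADER.match(line)
        if not match:
            continue
        name, state, rest = match.groups()
        # Members carry a role suffix such as "sda1[0]"; the personality
        # name ("raid1") has none, so filtering on "[" drops it.
        members = [tok.split("[")[0] for tok in rest.split() if "[" in tok]
        arrays[name] = {"state": state, "members": members}
    return arrays


def inactive_arrays(text):
    """Names of arrays whose state is anything other than "active"."""
    arrays = parse_mdstat(text)
    return sorted(name for name, info in arrays.items() if info["state"] != "active")


if __name__ == "__main__":
    print(inactive_arrays(MDSTAT))
```

On a live system the same functions could be fed the contents of /proc/mdstat instead of the embedded sample.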

tornado ~ # mdadm --examine /dev/sda1
/dev/sda1:
  Magic : a92b4efc
Version : 00.90.00
   UUID : f5c2e565:5ed956c0:33b08c07:16154426
  Creation Time : Fri Feb  2 10:16:29 2007
 Raid Level : raid1
  Used Dev Size : 104320 (101.89 MiB 106.82 MB)
 Array Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1

Update Time : Fri Apr  6 02:06:17 2007
  State : clean
Internal Bitmap : present
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
   Checksum : d3668aaa - correct
 Events : 0.368


      Number   Major   Minor   RaidDevice State
this     0       8        1        0      active sync   /dev/sda1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       33        1      active sync   /dev/sdc1
tornado ~ # mdadm --examine /dev/sdc1
/dev/sdc1:
  Magic : a92b4efc
Version : 00.90.00
   UUID : f5c2e565:5ed956c0:33b08c07:16154426
  Creation Time : Fri Feb  2 10:16:29 2007
 Raid Level : raid1
  Used Dev Size : 104320 (101.89 MiB 106.82 MB)
 Array Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1

Update Time : Fri Apr  6 02:06:17 2007
  State : clean
Internal Bitmap : present
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
   Checksum : d3668acc - correct
 Events : 0.368


      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       33        1      active sync   /dev/sdc1
tornado ~ #


tornado ~ # mdadm --assemble /dev/md1 /dev/sda1 /dev/sdc1
mdadm: device /dev/md1 already active - cannot assemble it
tornado ~ # mdadm --run /dev/md1
mdadm: failed to run array /dev/md1: Cannot allocate memory
tornado ~ #

and looking at a dmesg, this is logged:

md: bind<sdc1>
md: bind<sda1>
raid1: raid set md1 active with 2 out of 2 mirrors
md1: bitmap initialized from disk: read 0/1 pages, set 0 bits, status: -12
md1: failed to create bitmap (-12)
md: pers->run() failed ...
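
A note on reading the numbers in these lines: the kernel reports failures as negated errno values, and 12 is ENOMEM on Linux, which matches the "Cannot allocate memory" that mdadm printed. The sketch below is only an illustration of that decoding. The `LINUX_ERRNO` table and the `decode` helper are assumptions made for the example (they belong to no tool in this thread), and the values are hard-coded from the Linux errno headers rather than taken from whatever platform runs the script.

```python
import re

# A few Linux errno numbers with the message glibc's strerror() gives them.
# Hard-coded on purpose so the result does not depend on the local platform.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    12: ("ENOMEM", "Cannot allocate memory"),
    16: ("EBUSY", "Device or resource busy"),
    22: ("EINVAL", "Invalid argument"),
}

# Matches the two spellings seen in the md1 lines: "status: -12" and "(-12)".
CODE = re.compile(r"(?:status: |\()-(\d+)")


def decode(line):
    """Return (code, name, message) for the first negated errno in a log line.

    Returns None when the line carries no such code.
    """
    match = CODE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, message = LINUX_ERRNO.get(code, ("unknown", "unknown"))
    return code, name, message


if __name__ == "__main__":
    for line in (
        "md1: bitmap initialized from disk: read 0/1 pages, set 0 bits, status: -12",
        "md1: failed to create bitmap (-12)",
        "md: pers->run() failed ...",
    ):
        print(decode(line))
```

Both md1 lines decode to ENOMEM, so the userspace error and the kernel log are reporting the same failure: the bitmap could not be allocated.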

tornado ~ # uname -a
Linux tornado 2.6.21-rc5-mm4 #1 SMP Thu Apr 5 23:47:42 EST 2007 x86_64 Intel(R) 
Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux

tornado ~ #

The last known version that worked was 2.6.21-rc3-mm1 - I haven't been testing 
out the -mm releases so much lately.


Also, Andrew, can you please restart posting/cc'ing your -mm announcements to 
the [EMAIL PROTECTED] list?  Seems this stopped around about 
2.6.20, it was handy.


.config is up at http://www.reub.net/files/kernel/configs/2.6.21-rc5-mm4

Thanks,
Reuben


Re: 2.6.21-rc4-mm1

2007-03-21 Thread Reuben Farrelly

On 20/03/2007 3:56 PM, Andrew Morton wrote:

Temporarily at

  http://userweb.kernel.org/~akpm/2.6.21-rc4-mm1/

Will appear later at

  
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc4/2.6.21-rc4-mm1/



- Restored the RSDL CPU scheduler (a new version thereof)


Just booted into this kernel, and hit this, which locked up the machine:

This is tornado.reub.net (Linux x86_64 2.6.21-rc4-mm1) 20:16:58

tornado login: [ cut here ]
kernel BUG at kernel/sched.c:3505!
invalid opcode:  [1] SMP
last sysfs file: devices/pci:00/:00:1f.3/i2c-adapter/i2c-0/0-002e/pwm3
CPU 1
Modules linked in: firmware_class eeprom lm85 hwmon_vid i2c_i801 8021q 
iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nfnetlink 
iptable_mangle ip_tables nfs lockd sunrpc ohci1394 ieee1394 usb_storage

Pid: 8250, comm: clamd Not tainted 2.6.21-rc4-mm1 #1
RIP: 0010:[8025d2cb]  [8025d2cb] __sched_text_start+0x3cb/0x8b3
RSP: :8100023cfee0  EFLAGS: 00010002
RAX: 008c RBX: 810001e040e8 RCX: 000c
RDX:  RSI: 008c RDI: 810001e049b8
RBP: 8100023cff70 R08: 008c R09: 810001e049a8
R10: 0034 R11:  R12: 810001e03f00
R13: 0002 R14:  R15: 00521b55f827
FS:  2b1dfda2ec00() GS:81000208ec40() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 2afcf000 CR3: 04ac3000 CR4: 06e0
Process clamd (pid: 8250, threadinfo 8100023ce000, task 810004c090a0)
Stack:  810004c090a0 8025fdb7 810004c090a0 7fffae43e955
 810004c09248 0001023cff28 8029635d 00c5aac0
 0005 2b1dfc7d6d5a 8025fdb7 
Call Trace:
 [8025fdb7] trace_hardirqs_on_thunk+0x35/0x37
 [8029635d] trace_hardirqs_on+0x12a/0x15d
 [8025fdb7] trace_hardirqs_on_thunk+0x35/0x37
 [8025a7e0] retint_careful+0x12/0x2e


Code: 0f 0b eb fe 49 8b 94 24 e0 01 00 00 49 8b 84 24 d8 01 00 00
RIP  [8025d2cb] __sched_text_start+0x3cb/0x8b3
 RSP 8100023cfee0
BUG: spinlock lockup on CPU#0, swapper/0, 810001e03f00
BUG: spinlock lockup on CPU#1, clamd/8250, 810001e03f00

every few minutes the last two lines would be repeated.
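
For reading traces like the one above: each frame is written as symbol+0xOFFSET/0xSIZE, that is, how far into the symbol the address falls and how large the symbol is. Below is a rough sketch of splitting frames into those parts. The `Frame` type and `parse_frames` function are hypothetical names for this example; the pattern accepts brackets with or without an address, since some archived copies of these posts lost them.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    address: Optional[int]  # None when the archive dropped the address
    symbol: str
    offset: int             # bytes into the symbol
    size: int               # total size of the symbol in bytes


# "[8025d2cb] __sched_text_start+0x3cb/0x8b3", "[<...>] sym+0x1/0x2" or "[] sym+0x1/0x2"
FRAME = re.compile(r"\[<?([0-9a-f]*)>?\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(text):
    """Return a Frame for every symbol+offset/size entry found in the text."""
    frames = []
    for match in FRAME.finditer(text):
        address, symbol, offset, size = match.groups()
        frames.append(
            Frame(
                int(address, 16) if address else None,
                symbol,
                int(offset, 16),
                int(size, 16),
            )
        )
    return frames


if __name__ == "__main__":
    sample = (
        "RIP  [8025d2cb] __sched_text_start+0x3cb/0x8b3\n"
        " [] retint_careful+0x12/0x2e\n"
    )
    for frame in parse_frames(sample):
        print(frame)
```

The offset and size are what a tool needs to map a frame back to a source line in a matching kernel build.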

This kernel does not include the hotfixes (the gentoo portage ebuild for this 
release does not yet include them), however I am uncertain if they fix this 
problem or not anyway.


Also, what happened to the -mm announcements sent to 
[EMAIL PROTECTED]?  Maybe I'm the only person to miss them :-)


Reuben


Re: [PATCH 2.6.19-rc5-mm2] cpufreq: set policy->curfreq on initialization

2006-11-17 Thread Reuben Farrelly



On 16/11/2006 6:05 AM, Mattia Dongili wrote:

Check the correct variable and set policy->cur upon acpi-cpufreq
initialization to allow the userspace governor to be used as default.

Signed-off-by: Mattia Dongili <[EMAIL PROTECTED]>

---

Reuben, could you also try if this patch fixes the BUG()?
Thanks


It does, and all looks fine now, thanks.  Sorry for not getting back about it a 
little earlier.


Reuben



diff --git a/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
index 18f4715..a630f94 100644
--- a/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -699,14 +699,14 @@ static int acpi_cpufreq_cpu_init(struct
 	if (result)
 		goto err_freqfree;
 
-	switch (data->cpu_feature) {
+	switch (perf->control_register.space_id) {
 	case ACPI_ADR_SPACE_SYSTEM_IO:
 		/* Current speed is unknown and not detectable by IO port */
 		policy->cur = acpi_cpufreq_guess_freq(data, policy->cpu);
 		break;
 	case ACPI_ADR_SPACE_FIXED_HARDWARE:
 		acpi_cpufreq_driver.get = get_cur_freq_on_cpu;
-		get_cur_freq_on_cpu(cpu);
+		policy->cur = get_cur_freq_on_cpu(cpu);
 		break;
 	default:
 		break;



Re: [linux-usb-devel] Re: 2.6.13-mm1

2005-09-05 Thread Reuben Farrelly

Hi Alan,

On 3/09/2005 3:19 a.m., Alan Stern wrote:

On Thu, 1 Sep 2005, Andrew Morton wrote:


Reuben Farrelly <[EMAIL PROTECTED]> wrote:



I'm also observing some USB messages logged:

Sep  2 13:26:22 tornado kernel: usb 5-1: new full speed USB device using 
uhci_hcd and address 13
Sep  2 13:26:22 tornado kernel: drivers/usb/class/usblp.c: usblp0: USB 
Bidirectional printer dev 13 if 0 alt 0 proto 2 vid 0x03F0 pid 0x6204
Sep  2 13:26:23 tornado kernel: hub 5-0:1.0: port 1 disabled by hub (EMI?), 
re-enabling...


This message means pretty much what it says: noise or something else 
caused the connection to be disabled.  In theory this could be caused by a 
problem with the host controller, the cable, or the printer.  Does this 
happen consistently with 2.6.13-mm1?  Did it happen with 2.6.12?


It may have just been a red herring, as I haven't had the problem appear 
since, nor had I seen it before then.  I've done multiple reboots, plug and 
unplugs to test since and all have been OK.


Thanks for taking the time to reply.

reuben


Re: 2.6.13-mm1: hangs during boot ...

2005-09-05 Thread Reuben Farrelly

Hi,

On 5/09/2005 4:32 a.m., James Bottomley wrote:

On Sun, 2005-09-04 at 01:24 +1200, Reuben Farrelly wrote:
I am seeing it fill up my messages log as it is logging 1 or so messages each 
minute.  I've emailed the SCSI maintainer James Bottomley twice about it but 
had no response either time.


OK, can you try this ... it should confirm the theory if the messages go
away.

Thanks,

James

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -315,7 +315,7 @@ int scsi_execute(struct scsi_device *sde
req->sense = sense;
req->sense_len = 0;
req->timeout = timeout;
-   req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL;
+   req->flags |= flags | REQ_BLOCK_PC | REQ_SPECIAL | REQ_QUIET;
 
 	/*
 	 * head injection *required* here otherwise quiesce won't work
@@ -927,17 +927,20 @@ void scsi_io_completion(struct scsi_cmnd
scsi_requeue_command(q, cmd);
return;
}
-   printk(KERN_INFO "Device %s not ready.\n",
-  req->rq_disk ? req->rq_disk->disk_name : "");
+   if (!(req->flags & REQ_QUIET))
+   dev_printk(KERN_INFO,
+  &cmd->device->sdev_gendev,
+  "Device not ready.\n");
cmd = scsi_end_request(cmd, 0, this_count, 1);
return;
case VOLUME_OVERFLOW:
-   printk(KERN_INFO "Volume overflow <%d %d %d %d> CDB: ",
-  cmd->device->host->host_no,
-  (int)cmd->device->channel,
-  (int)cmd->device->id, (int)cmd->device->lun);
-   __scsi_print_command(cmd->data_cmnd);
-   scsi_print_sense("", cmd);
+   if (!(req->flags & REQ_QUIET)) {
+   dev_printk(KERN_INFO,
+  &cmd->device->sdev_gendev,
+  "Volume overflow, CDB: ");
+   __scsi_print_command(cmd->data_cmnd);
+   scsi_print_sense("", cmd);
+   }
cmd = scsi_end_request(cmd, 0, block_bytes, 1);
return;
default:
@@ -954,15 +957,13 @@ void scsi_io_completion(struct scsi_cmnd
return;
}
if (result) {
-   if (!(req->flags & REQ_SPECIAL))
-   printk(KERN_INFO "SCSI error : <%d %d %d %d> return code "
-  "= 0x%x\n", cmd->device->host->host_no,
-  cmd->device->channel,
-  cmd->device->id,
-  cmd->device->lun, result);
+   if (!(req->flags & REQ_QUIET)) {
+   dev_printk(KERN_INFO, &cmd->device->sdev_gendev,
+  "SCSI error: return code = 0x%x\n", result);
 
-		if (driver_byte(result) & DRIVER_SENSE)
-			scsi_print_sense("", cmd);
+   if (driver_byte(result) & DRIVER_SENSE)
+   scsi_print_sense("", cmd);
+   }
/*
 * Mark a single buffer as not uptodate.  Queue the remainder.
 * We sometimes get this cruft in the event that a medium error
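
The idea of the patch, in miniature: scsi_execute() tags its requests with a quiet bit, and the completion path checks that bit before logging anything. The sketch below models only that gating. The `ReqFlags` bit values and the `complete` function are invented for illustration and do not match the kernel's definitions.

```python
from enum import IntFlag


class ReqFlags(IntFlag):
    # Bit values are made up for this example; the kernel's differ.
    BLOCK_PC = 1
    SPECIAL = 2
    QUIET = 4


def complete(flags, result, log):
    """Model of the patched completion path: log errors unless QUIET is set."""
    if result and not (flags & ReqFlags.QUIET):
        log.append("SCSI error: return code = 0x%x" % result)
    return result


if __name__ == "__main__":
    log = []
    # Internally issued command, tagged quiet as in the patched scsi_execute().
    complete(ReqFlags.BLOCK_PC | ReqFlags.SPECIAL | ReqFlags.QUIET, 0x8000002, log)
    # Ordinary request without the quiet bit.
    complete(ReqFlags.BLOCK_PC, 0x8000002, log)
    print(log)
```

Only the second call leaves a log entry, which is the behaviour the test patch is meant to confirm: the repeated messages come from internally issued commands.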


This patch fixes it, and there was no message during boot about not being 
ready, nor after the machine had fully booted.  Great ;-)


However, I did get an oops when warm booting the kernel.  I suspect this may be 
the oops that I get every now and then when warm rebooting, with no real 
pattern, and it possibly isn't related to the patch.  As my serial console wasn't 
set up at the time, I took a photo instead, at 
http://www.reub.net/kernel/scsi-oops.jpg


Thanks
reuben



Re: 2.6.13-mm1: hangs during boot ...

2005-09-03 Thread Reuben Farrelly

Hi Peter,

On 3/09/2005 4:59 a.m., Peter Williams wrote:

Brown, Len wrote:

[  279.662960]  [c02d5c74] wait_for_completion+0xa4/0x110



possibly a missing interrupt?



CONFIG_ACPI=y



any difference if booted with "acpi=off" or "acpi=noirq"?


Yes.  In both cases, the system appears to boot normally but I'm unable 
to login or connect via ssh.  Also there's a "device not ready" message


Are you seeing this "Device  not ready" message appear over and over, or just 
the once?


I am seeing it fill up my messages log as it is logging 1 or so messages each 
minute.  I've emailed the SCSI maintainer James Bottomley twice about it but 
had no response either time.


The SCSI device I have is:

Sep  3 22:14:40 tornado kernel: Vendor: SONY  Model: CD-RW  CRX145S  Rev: 1.0b

As for the inability to log in, this bug may be relevant, given I also had 
that problem:


https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166422

There are fixes in the pipeline for util-linux audit interaction in Fedora as 
well.  I know because I reported those too ;)


after the scsi initialization which I don't normally see.  I've attached 
the scsi initialization output.  The PF_NETLINK error messages after the 
login prompt in this output are created whenever I try to log in or 
connect via ssh.


The workaround is to enable audit support, but obviously a better fix is in the 
pipeline.


I'm surprised more people aren't discovering these 'interactions' due to 
having audit not turned on.  Does everyone build audit into their kernels?


reuben



Re: 2.6.13-mm1

2005-09-01 Thread Reuben Farrelly

Hi,

On 1/09/2005 10:58 a.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13/2.6.13-mm1/

- Included Alan's big tty layer buffering rewrite.  This breaks the build on
  lots of more obscure character device drivers.  Patches welcome (please cc
  Alan).



Changes since 2.6.13-rc6-mm2:


 linus.patch
 git-acpi.patch
 git-arm.patch
 git-cpufreq.patch
 git-cryptodev.patch
 git-ia64.patch
 git-audit.patch
 git-audit-ppc64-fix.patch
 git-input.patch
 git-jfs-fixup.patch
 git-kbuild.patch
 git-libata-all.patch
 git-mtd.patch
 git-netdev-all.patch
 git-nfs.patch
 git-ocfs2.patch
 git-serial.patch
 git-scsi-block.patch
 git-scsi-iscsi.patch
 git-scsi-misc.patch
 git-watchdog.patch


This patch:

netlink-log-protocol-failures.patch

is causing lots of messages like this to be logged on my console:

Sep  2 11:52:41 tornado kernel: DEBUG: Failed to load PF_NETLINK protocol 9

It seems to be caused by audit support not being enabled, as the message goes 
away if I rebuild with audit support :)



I'm also observing some USB messages logged:

Sep  2 13:26:22 tornado kernel: usb 5-1: new full speed USB device using 
uhci_hcd and address 13
Sep  2 13:26:22 tornado kernel: drivers/usb/class/usblp.c: usblp0: USB 
Bidirectional printer dev 13 if 0 alt 0 proto 2 vid 0x03F0 pid 0x6204
Sep  2 13:26:23 tornado kernel: hub 5-0:1.0: port 1 disabled by hub (EMI?), 
re-enabling...

Sep  2 13:26:23 tornado kernel: usb 5-1: USB disconnect, address 13
Sep  2 13:26:23 tornado kernel: drivers/usb/class/usblp.c: usblp0: removed
Sep  2 13:26:23 tornado kernel: usb 5-1: new full speed USB device using 
uhci_hcd and address 14

Sep  2 13:26:23 tornado kernel: usb 5-1: device descriptor read/64, error -71
Sep  2 13:26:23 tornado kernel: usb 5-1: device descriptor read/64, error -71
Sep  2 13:26:23 tornado kernel: usb 5-1: new full speed USB device using 
uhci_hcd and address 15

Sep  2 13:26:23 tornado kernel: usb 5-1: device descriptor read/all, error -71
Sep  2 13:26:23 tornado kernel: usb 5-1: new full speed USB device using 
uhci_hcd and address 16

Sep  2 13:26:23 tornado kernel: usb 5-1: can't set config #1, error -71
Sep  2 13:26:23 tornado kernel: usb 5-1: new full speed USB device using 
uhci_hcd and address 17
Sep  2 13:26:24 tornado kernel: usb 5-1: unable to read config index 0 
descriptor/start

Sep  2 13:26:24 tornado kernel: usb 5-1: can't read configurations, error -71
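
All of the failures above carry the same code: -71 is the negated EPROTO ("Protocol error") on Linux. A quick way to confirm that a log excerpt contains one recurring error rather than several different ones is to tally (device, code) pairs. The sketch below does that over lines copied from this log; `tally` is a hypothetical helper name for the example.

```python
import re
from collections import Counter

# Error lines copied from the log above.
LOG = """\
Sep  2 13:26:23 tornado kernel: usb 5-1: device descriptor read/64, error -71
Sep  2 13:26:23 tornado kernel: usb 5-1: device descriptor read/64, error -71
Sep  2 13:26:23 tornado kernel: usb 5-1: device descriptor read/all, error -71
Sep  2 13:26:23 tornado kernel: usb 5-1: can't set config #1, error -71
Sep  2 13:26:24 tornado kernel: usb 5-1: can't read configurations, error -71
"""

# Captures the USB device path ("5-1") and the trailing negative error code.
LINE = re.compile(r"kernel: usb (\S+): .*error (-\d+)$")


def tally(log):
    """Count occurrences of each (device, error code) pair in syslog text."""
    counts = Counter()
    for line in log.splitlines():
        match = LINE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    for (device, code), count in tally(LOG).items():
        print(device, code, count)
```

A single pair in the result means every retry on that port failed the same way.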

[EMAIL PROTECTED] kernel]# lsusb
Bus 005 Device 004: ID 050d:0105 Belkin Components
Bus 005 Device 003: ID 0451:2046 Texas Instruments, Inc. TUSB2046 Hub
Bus 005 Device 001: ID :
Bus 004 Device 001: ID :
Bus 003 Device 001: ID :
Bus 002 Device 001: ID :
Bus 001 Device 001: ID :
[EMAIL PROTECTED] kernel]#

Output of lsusb -v up at http://www.reub.net/kernel/lsusb-output

reuben


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Inotify problem [was Re: 2.6.13-rc6-mm1]

2005-08-25 Thread Reuben Farrelly

Hi,

On 22/08/2005 9:10 p.m., John McCutchan wrote:

On Sat, 2005-08-20 at 23:52 -0700, Andrew Morton wrote:

Reuben Farrelly <[EMAIL PROTECTED]> wrote:

Hi,

On 19/08/2005 11:37 a.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm1/

- Lots of fixes, updates and cleanups all over the place.

- If you have the right debugging options set, this kernel will generate
  a storm of sleeping-in-atomic-code warnings at boot, from the scsi code.
  It is being worked on.


Changes since 2.6.13-rc5-mm1:

 linus.patch

Noted this in my log earlier today.

Is this inotify related?

Aug 21 08:33:04 tornado kernel: idr_remove called for id=2048 which is not allocated.
Aug 21 08:33:04 tornado kernel:  [<c0103a00>] dump_stack+0x17/0x19
Aug 21 08:33:04 tornado kernel:  [<c01c9f9a>] idr_remove_warning+0x1b/0x1d
Aug 21 08:33:04 tornado kernel:  [<c01ca024>] sub_remove+0x88/0xea
Aug 21 08:33:04 tornado kernel:  [<c01ca0a1>] idr_remove+0x1b/0x7f
Aug 21 08:33:04 tornado kernel:  [<c018176a>] remove_watch_no_event+0x7a/0x12e
Aug 21 08:33:04 tornado kernel:  [<c0181f64>] inotify_release+0x8f/0x1af
Aug 21 08:33:04 tornado kernel:  [<c015ca80>] __fput+0xaf/0x199
Aug 21 08:33:04 tornado kernel:  [<c015c9b8>] fput+0x22/0x3b
Aug 21 08:33:04 tornado kernel:  [<c015b2ed>] filp_close+0x41/0x67
Aug 21 08:33:04 tornado kernel:  [<c015b383>] sys_close+0x70/0x92
Aug 21 08:33:04 tornado kernel:  [<c0102a9b>] sysenter_past_esp+0x54/0x75
Aug 21 08:33:04 tornado kernel: idr_remove called for id=3072 which is not allocated.
Aug 21 08:33:05 tornado kernel:  [<c0103a00>] dump_stack+0x17/0x19
Aug 21 08:33:05 tornado kernel:  [<c01c9f9a>] idr_remove_warning+0x1b/0x1d
Aug 21 08:33:05 tornado kernel:  [<c01ca024>] sub_remove+0x88/0xea
Aug 21 08:33:05 tornado kernel:  [<c01ca0a1>] idr_remove+0x1b/0x7f
Aug 21 08:33:05 tornado kernel:  [<c018176a>] remove_watch_no_event+0x7a/0x12e
Aug 21 08:33:05 tornado kernel:  [<c0181f64>] inotify_release+0x8f/0x1af
Aug 21 08:33:05 tornado kernel:  [<c015ca80>] __fput+0xaf/0x199
Aug 21 08:33:05 tornado kernel:  [<c015c9b8>] fput+0x22/0x3b
Aug 21 08:33:05 tornado kernel:  [<c015b2ed>] filp_close+0x41/0x67
Aug 21 08:33:05 tornado kernel:  [<c015b383>] sys_close+0x70/0x92
Aug 21 08:33:05 tornado kernel:  [<c0102a9b>] sysenter_past_esp+0x54/0x75

This would have been triggered by using dovecot IMAP which is configured to 
use inotify on Maildir.

I'm also seeing some userspace errors logged for dovecot:

"Aug 21 04:17:22 Error: IMAP(reuben): inotify_rm_watch() failed: Invalid 
argument"

I'll deal with those with the guy who wrote the inotify code in dovecot.

I'm not so sure userspace should be able or need to cause the kernel to dump 
stack traces like that though?



Yes, the stack dumps would appear to be due to an inotify bug.

The message from dovecot is allegedly due to dovecot passing in a file
descriptor which was not obtained from the inotify_init() syscall.  But
until we know what caused those stack dumps we cannot definitely say
whether dovecot is at fault.



Inotify has a check on both add and rm watch syscalls:

	/* verify that this is indeed an inotify instance */
	if (unlikely(filp->f_op != &inotify_fops)) {
		ret = -EINVAL;
		goto out;
	}

This is crashing in inotify_release, which is called on close of the
inotify instance. So this fd must be from an inotify instance right?
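
A minimal userspace sketch of the check quoted above, for illustration only.
It assumes a Linux system with glibc's <sys/inotify.h>; the helper names are
hypothetical and come from neither dovecot nor the kernel.  It shows two ways
inotify_rm_watch() can fail with EINVAL: calling it on a descriptor that is
not an inotify instance, and removing the same watch descriptor twice.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: call inotify_rm_watch() on an ordinary file
 * descriptor.  Returns the errno reported by the call, 0 if the call
 * unexpectedly succeeded, or -1 if the test fd could not be opened.
 */
int rm_watch_errno_on_plain_fd(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;

    errno = 0;
    int ret = inotify_rm_watch(fd, 1);   /* fd is not an inotify instance */
    int err = (ret == -1) ? errno : 0;

    close(fd);
    return err;
}

/*
 * Hypothetical helper: add one watch, remove it, then remove it again.
 * Returns the errno of the second removal, 0 if it unexpectedly
 * succeeded, or -1 if any setup step failed.
 */
int rm_watch_errno_on_stale_wd(void)
{
    int ifd = inotify_init();
    if (ifd < 0)
        return -1;

    int wd = inotify_add_watch(ifd, "/", IN_ATTRIB);
    if (wd < 0) {
        close(ifd);
        return -1;
    }

    int first = inotify_rm_watch(ifd, wd);   /* valid removal */

    errno = 0;
    int second = inotify_rm_watch(ifd, wd);  /* wd is now stale */
    int err = (second == -1) ? errno : 0;

    close(ifd);
    return (first == 0) ? err : -1;
}
```

On a kernel with inotify available, both helpers are expected to return
EINVAL, so the error string alone does not say which of the two cases
userspace hit.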

I looked at the dovecot code, it looks fine wrt inotify. Long shot, but
the close-on-exec flag is set. Could this be tripping anything up?


I have also observed another problem with inotify and dovecot, so I spoke
with Johannes Berg, who wrote the inotify code in dovecot.  He suggested I post
here to LKML, since in his opinion this is a kernel bug.


The problem I am observing is this, logged by dovecot after a period of time 
when a client is connected:


dovecot: Aug 22 14:31:23 Error: IMAP(gilly): inotify_rm_watch() failed: Invalid argument
dovecot: Aug 22 14:31:23 Error: IMAP(gilly): inotify_rm_watch() failed: Invalid argument
dovecot: Aug 22 14:31:23 Error: IMAP(gilly): inotify_rm_watch() failed: Invalid argument


Multiply that by about 1000 ;-)

Some debugging shows this:
dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): removing wd 1019 from inotify fd 4
dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): removing wd 1018 from inotify fd 4
dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): inotify_add_watch returned 1019
dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): inotify_add_watch returned 1020
dovecot: Aug 25 19:31:23 Warning: IMAP(gilly): removing wd 1020 from inotify fd 4
dovecot: Aug 25 19:31:23 Warning: IMAP(gilly): removing wd 1019 from inotify fd 4
dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): inotify_add_watch returned 1020
dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): inotify_add_watch returned 1021
dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): removing wd 1021 from inotify fd 4
dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): removing wd 1020 from inotify fd 4
dovecot: Aug 25 19:31:25 Warning: IMAP(gilly): inotify_add_watch returned 1021
dovecot: Aug 25 19:31:25 Warning: IMAP(gilly): inotify_add_watch returned 1022
dovec

Re: 2.6.13-rc6-mm2

2005-08-23 Thread Reuben Farrelly

Hi,

On 23/08/2005 4:30 p.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/

- Various updates.  Nothing terribly noteworthy.


Yup, seems to be generally good...

Noticed this in the log earlier tonight:

Aug 23 19:44:51 tornado kernel: hub 5-0:1.0: port 1 disabled by hub (EMI?), re-enabling...
Aug 23 19:44:51 tornado kernel: usb 5-1: USB disconnect, address 2
Aug 23 19:44:51 tornado kernel: drivers/usb/class/usblp.c: usblp0: removed
Aug 23 19:44:51 tornado kernel: Unable to handle kernel NULL pointer dereference at virtual address 0004
Aug 23 19:44:51 tornado kernel:  printing eip:
Aug 23 19:44:51 tornado kernel: c01ccef2
Aug 23 19:44:51 tornado kernel: *pde = 
Aug 23 19:44:51 tornado kernel: Oops:  [#1]
Aug 23 19:44:51 tornado kernel: SMP
Aug 23 19:44:51 tornado kernel: last sysfs file: /devices/pci:00/:00:1f.3/i2c-0/name
Aug 23 19:44:51 tornado kernel: Modules linked in: nfsd exportfs lockd eeprom sunrpc ipv6 iptable_filter binfmt_misc reiser4 zlib_deflate zlib_inflate dm_mod video thermal processor fan button ac tpm_nsc i2c_i801 sky2 e100 sr_mod
Aug 23 19:44:51 tornado kernel: CPU:    1
Aug 23 19:44:51 tornado kernel: EIP:    0060:[<c01ccef2>]    Not tainted VLI
Aug 23 19:44:51 tornado kernel: EFLAGS: 00010286   (2.6.13-rc6-mm2)
Aug 23 19:44:51 tornado kernel: EIP is at _raw_spin_lock+0x7/0x73
Aug 23 19:44:51 tornado kernel: eax:    ebx:    ecx: c1a60658   edx: c1a63e24
Aug 23 19:44:51 tornado kernel: esi:    edi: c0382400   ebp: f7c55e98   esp: f7c55e90
Aug 23 19:44:51 tornado kernel: ds: 007b   es: 007b   ss: 0068
Aug 23 19:44:51 tornado kernel: Process khubd (pid: 109, threadinfo=f7c54000 task=c192b030)
Aug 23 19:44:51 tornado kernel: Stack: f7c58a8c  f7c55ea0 c0312219 f7c55eb0 c030feb7 f7c58ae8 f7c58a48
Aug 23 19:44:51 tornado kernel:        f7c55ec4 c0217e73 f7c58a48 f7d134ec 0040 f7c55ed0 c0217ec0 f7c58a48
Aug 23 19:44:51 tornado kernel:        f7c55edc c0217814 f7c58a48 f7c55eec c0216ad2 f7c58a48 f7c58a14 f7c55ef8
Aug 23 19:44:51 tornado kernel: Call Trace:
Aug 23 19:44:51 tornado kernel:  [<c01039c3>] show_stack+0x94/0xca
Aug 23 19:44:51 tornado kernel:  [<c0103b6c>] show_registers+0x15a/0x1ea
Aug 23 19:44:51 tornado kernel:  [<c0103d8a>] die+0x108/0x183
Aug 23 19:44:51 tornado kernel:  [<c031295a>] do_page_fault+0x1ea/0x63d
Aug 23 19:44:51 tornado kernel:  [<c0103693>] error_code+0x4f/0x54
Aug 23 19:44:51 tornado kernel:  [<c0312219>] _spin_lock+0x8/0xa
Aug 23 19:44:51 tornado kernel:  [<c030feb7>] klist_remove+0x10/0x2c
Aug 23 19:44:51 tornado kernel:  [<c0217e73>] __device_release_driver+0x41/0x65
Aug 23 19:44:51 tornado kernel:  [<c0217ec0>] device_release_driver+0x29/0x39
Aug 23 19:44:51 tornado kernel:  [<c0217814>] bus_remove_device+0x52/0x60
Aug 23 19:44:51 tornado kernel:  [<c0216ad2>] device_del+0x2e/0x5d
Aug 23 19:44:51 tornado kernel:  [<c0216b0c>] device_unregister+0xb/0x15
Aug 23 19:44:51 tornado kernel:  [<c0275d67>] usb_disconnect+0x115/0x15c
Aug 23 19:44:51 tornado kernel:  [<c0276b85>] hub_port_connect_change+0x54/0x399
Aug 23 19:44:51 tornado kernel:  [<c027713e>] hub_events+0x274/0x3b2
Aug 23 19:44:51 tornado kernel:  [<c0277296>] hub_thread+0x1a/0xdf
Aug 23 19:44:51 tornado kernel:  [<c012fba7>] kthread+0x99/0x9d
Aug 23 19:44:51 tornado kernel:  [<c01010b5>] kernel_thread_helper+0x5/0xb
Aug 23 19:44:51 tornado kernel: Code: 00 00 00 8b 0d a8 62 36 c0 e9 61 ff ff ff f3 90 31 c0 86 07 84 c0 0f 8e 79 ff ff ff 83 c4 18 5b 5e 5f 5d c3 55 89 e5 56 53 89 c3 <81> 78 04 ad 4e ad de 75 2d be 00 e0 ff ff 21 e6 8b 06 39 43 0c


reuben


Re: 2.6.13-rc6-mm1

2005-08-21 Thread Reuben Farrelly

Hi,

On 19/08/2005 11:37 a.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm1/

- Lots of fixes, updates and cleanups all over the place.

- If you have the right debugging options set, this kernel will generate
  a storm of sleeping-in-atomic-code warnings at boot, from the scsi code.
  It is being worked on.


Changes since 2.6.13-rc5-mm1:

 linus.patch


Noted this in my log earlier today.

Is this inotify related?

Aug 21 08:33:04 tornado kernel: idr_remove called for id=2048 which is not allocated.
Aug 21 08:33:04 tornado kernel:  [<c0103a00>] dump_stack+0x17/0x19
Aug 21 08:33:04 tornado kernel:  [<c01c9f9a>] idr_remove_warning+0x1b/0x1d
Aug 21 08:33:04 tornado kernel:  [<c01ca024>] sub_remove+0x88/0xea
Aug 21 08:33:04 tornado kernel:  [<c01ca0a1>] idr_remove+0x1b/0x7f
Aug 21 08:33:04 tornado kernel:  [<c018176a>] remove_watch_no_event+0x7a/0x12e
Aug 21 08:33:04 tornado kernel:  [<c0181f64>] inotify_release+0x8f/0x1af
Aug 21 08:33:04 tornado kernel:  [<c015ca80>] __fput+0xaf/0x199
Aug 21 08:33:04 tornado kernel:  [<c015c9b8>] fput+0x22/0x3b
Aug 21 08:33:04 tornado kernel:  [<c015b2ed>] filp_close+0x41/0x67
Aug 21 08:33:04 tornado kernel:  [<c015b383>] sys_close+0x70/0x92
Aug 21 08:33:04 tornado kernel:  [<c0102a9b>] sysenter_past_esp+0x54/0x75
Aug 21 08:33:04 tornado kernel: idr_remove called for id=3072 which is not allocated.
Aug 21 08:33:05 tornado kernel:  [<c0103a00>] dump_stack+0x17/0x19
Aug 21 08:33:05 tornado kernel:  [<c01c9f9a>] idr_remove_warning+0x1b/0x1d
Aug 21 08:33:05 tornado kernel:  [<c01ca024>] sub_remove+0x88/0xea
Aug 21 08:33:05 tornado kernel:  [<c01ca0a1>] idr_remove+0x1b/0x7f
Aug 21 08:33:05 tornado kernel:  [<c018176a>] remove_watch_no_event+0x7a/0x12e
Aug 21 08:33:05 tornado kernel:  [<c0181f64>] inotify_release+0x8f/0x1af
Aug 21 08:33:05 tornado kernel:  [<c015ca80>] __fput+0xaf/0x199
Aug 21 08:33:05 tornado kernel:  [<c015c9b8>] fput+0x22/0x3b
Aug 21 08:33:05 tornado kernel:  [<c015b2ed>] filp_close+0x41/0x67
Aug 21 08:33:05 tornado kernel:  [<c015b383>] sys_close+0x70/0x92
Aug 21 08:33:05 tornado kernel:  [<c0102a9b>] sysenter_past_esp+0x54/0x75

This would have been triggered by using dovecot IMAP which is configured to 
use inotify on Maildir.

I'm also seeing some userspace errors logged for dovecot:

"Aug 21 04:17:22 Error: IMAP(reuben): inotify_rm_watch() failed: Invalid 
argument"

I'll deal with those with the guy who wrote the inotify code in dovecot.

I'm not so sure userspace should be able or need to cause the kernel to dump 
stack traces like that though?


reuben


Re: 2.6.13-rc6-mm1

2005-08-21 Thread Reuben Farrelly

Hi,

On 21/08/2005 1:40 a.m., David Woodhouse wrote:

On Fri, 2005-08-19 at 18:36 -0700, Andrew Morton wrote:

Reuben Farrelly <[EMAIL PROTECTED]> wrote:

...

4. PAM is complaining about "PAM audit_open() failed: Protocol not supported"
and I can't log in as any user including root.  I would have picked this as a
userspace problem, but it doesn't break with -rc5-mm1, yet reproducibly
breaks with -rc6-mm1.  Weird.

hm.  How come you're able to use the machine then?

Machine was booting up ok, and things were being written to syslog.  Rebooted 
into -rc5-mm1 to investigate, and of course could boot into rc6-mm1 in single 
user mode, test and bring services up one by one from there.  Having two boxes 
helped too.



Is it possible to get an strace of this failure somehow?

Not sure if this is needed anymore, as I found that the problem goes away when
I compile in kernel auditing.  This was not required for -rc5-mm1.  Is that
change intended?



Sounds wrong to me, especially if 2.6.13-rc6 doesn't do that.


Hm. It sounds like you'd configured PAM to require the pam_loginuid
module even though you didn't have auditing enabled in your kernel. That
seems strange and wrong to me, and _is_ a userspace problem.


I haven't touched my pam config since it was installed a long time ago - it's 
one of those things that is too annoying to fix once broken, so I leave it 
alone at the system defaults ;)


I had logged this as a Fedora bug as I figured the pam_loginuid
detection of the presence of auditing in the kernel is not very robust.  There 
was a patch modified in pam-0.80-6 at the start of August which was to fix 
this on non audit enabled kernels, which works for anything up to and older 
than 2.6.12-rc5-mm1.
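
A minimal sketch of how userspace might probe for kernel audit support, for
illustration only.  It assumes that audit_open() boils down to opening a
PF_NETLINK socket for protocol NETLINK_AUDIT (9, the protocol number in the
"Failed to load PF_NETLINK protocol 9" message).  The function names, the enum
and the choice of which errno values count as "audit not compiled in" are
assumptions made here, not the actual pam_loginuid or libaudit logic.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/* Hypothetical result codes for the probe below. */
enum kaudit_status {
    KAUDIT_PRESENT = 1,   /* audit netlink socket could be opened */
    KAUDIT_ABSENT  = 0,   /* kernel appears to lack audit support */
    KAUDIT_ERROR   = -1   /* some other failure, e.g. permissions */
};

/*
 * Map the errno from socket(PF_NETLINK, SOCK_RAW, NETLINK_AUDIT) to a
 * status.  Treating these three values as "absent" is an assumption.
 */
enum kaudit_status classify_audit_errno(int err)
{
    switch (err) {
    case 0:
        return KAUDIT_PRESENT;
    case EPROTONOSUPPORT:   /* "Protocol not supported" */
    case EAFNOSUPPORT:
    case EINVAL:
        return KAUDIT_ABSENT;
    default:
        return KAUDIT_ERROR;
    }
}

/* Try to open the audit netlink socket and classify the outcome. */
enum kaudit_status probe_kernel_audit(void)
{
    int fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_AUDIT);
    if (fd < 0)
        return classify_audit_errno(errno);

    close(fd);
    return KAUDIT_PRESENT;
}
```

A login module written along these lines could skip its audit step when the
probe returns KAUDIT_ABSENT instead of refusing the login outright.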


https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166422

It was closed 8 mins later, and the suggestion made that I take it to a pam 
development list instead.  Redhat don't seem so interested in fixing things as 
a result of breakage when running an -mm kernel.



I'd also agree that it shouldn't have changed with the new kernel though
-- and I can't think of anything I changed recently which would have
that effect. An strace would still be useful.


Done.  Posted up at  http://www.reub.net/kernel/strace-login


Can you double-check that you didn't have auditing enabled in your
older, working kernel?


Definitely wasn't enabled.  I still have the .config that I used to build
-rc5-mm1 with and my original -rc6-mm1 and it reads:

CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_HOTPLUG=y

Thanks for taking a look.

Reuben





Re: 2.6.13-rc6-mm1

2005-08-21 Thread Reuben Farrelly

Hi,

On 19/08/2005 11:37 a.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm1/

- Lots of fixes, updates and cleanups all over the place.

- If you have the right debugging options set, this kernel will generate
  a storm of sleeping-in-atomic-code warnings at boot, from the scsi code.
  It is being worked on.


Changes since 2.6.13-rc5-mm1:

 linus.patch


Noted this in my log earlier today.

Is this inotify related?

Aug 21 08:33:04 tornado kernel: idr_remove called for id=2048 which is not 
allocated.

Aug 21 08:33:04 tornado kernel:  [c0103a00] dump_stack+0x17/0x19
Aug 21 08:33:04 tornado kernel:  [c01c9f9a] idr_remove_warning+0x1b/0x1d
Aug 21 08:33:04 tornado kernel:  [c01ca024] sub_remove+0x88/0xea
Aug 21 08:33:04 tornado kernel:  [c01ca0a1] idr_remove+0x1b/0x7f
Aug 21 08:33:04 tornado kernel:  [c018176a] remove_watch_no_event+0x7a/0x12e
Aug 21 08:33:04 tornado kernel:  [c0181f64] inotify_release+0x8f/0x1af
Aug 21 08:33:04 tornado kernel:  [c015ca80] __fput+0xaf/0x199
Aug 21 08:33:04 tornado kernel:  [c015c9b8] fput+0x22/0x3b
Aug 21 08:33:04 tornado kernel:  [c015b2ed] filp_close+0x41/0x67
Aug 21 08:33:04 tornado kernel:  [c015b383] sys_close+0x70/0x92
Aug 21 08:33:04 tornado kernel:  [c0102a9b] sysenter_past_esp+0x54/0x75
Aug 21 08:33:04 tornado kernel: idr_remove called for id=3072 which is not 
allocated.

Aug 21 08:33:05 tornado kernel:  [c0103a00] dump_stack+0x17/0x19
Aug 21 08:33:05 tornado kernel:  [c01c9f9a] idr_remove_warning+0x1b/0x1d
Aug 21 08:33:05 tornado kernel:  [c01ca024] sub_remove+0x88/0xea
Aug 21 08:33:05 tornado kernel:  [c01ca0a1] idr_remove+0x1b/0x7f
Aug 21 08:33:05 tornado kernel:  [c018176a] remove_watch_no_event+0x7a/0x12e
Aug 21 08:33:05 tornado kernel:  [c0181f64] inotify_release+0x8f/0x1af
Aug 21 08:33:05 tornado kernel:  [c015ca80] __fput+0xaf/0x199
Aug 21 08:33:05 tornado kernel:  [c015c9b8] fput+0x22/0x3b
Aug 21 08:33:05 tornado kernel:  [c015b2ed] filp_close+0x41/0x67
Aug 21 08:33:05 tornado kernel:  [c015b383] sys_close+0x70/0x92
Aug 21 08:33:05 tornado kernel:  [c0102a9b] sysenter_past_esp+0x54/0x75

This would have been triggered by using dovecot IMAP which is configured to 
use inotify on Maildir.

I'm also seeing some userspace errors logged for dovecot:

Aug 21 04:17:22 Error: IMAP(reuben): inotify_rm_watch() failed: Invalid argument

I'll deal with those with the guy who wrote the inotify code in dovecot.

I'm not so sure userspace should be able or need to cause the kernel to dump 
stack traces like that though?
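
For reference, the dovecot error above is the errno a process gets back when it removes a watch descriptor the kernel no longer considers valid.  A minimal userspace sketch, assuming a Linux system with inotify available; the helper name second_rm_watch_errno is made up for illustration and this is not dovecot's code:

```c
#include <errno.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: watch `path`, remove the watch twice, and return
 * the errno produced by the second inotify_rm_watch() call.
 * Returns -1 if inotify could not be set up at all.
 */
int second_rm_watch_errno(const char *path)
{
	int fd, wd, err = 0;

	fd = inotify_init();
	if (fd < 0)
		return -1;

	wd = inotify_add_watch(fd, path, IN_CREATE | IN_DELETE);
	if (wd < 0) {
		close(fd);
		return -1;
	}

	/* The first removal is legitimate and should succeed. */
	if (inotify_rm_watch(fd, wd) < 0) {
		close(fd);
		return -1;
	}

	/* The descriptor is stale now, so the kernel should reject it. */
	if (inotify_rm_watch(fd, wd) < 0)
		err = errno;

	close(fd);
	return err;
}
```

On a healthy kernel the second removal should simply fail with EINVAL ("Invalid argument") and leave nothing in the kernel log.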


reuben

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.13-rc6-mm1

2005-08-19 Thread Reuben Farrelly

Hi again,

On 20/08/2005 5:34 a.m., Andrew Morton wrote:

Reuben Farrelly <[EMAIL PROTECTED]> wrote:



A few new problems cropped up with this kernel..

1. NFS seems to be unstable, oopsing when shutting down:


--- devel/fs/nfsd/nfssvc.c~ingo-nfs-stuff-fix   2005-08-19 10:29:15.0 -0700
+++ devel-akpm/fs/nfsd/nfssvc.c 2005-08-19 10:30:03.0 -0700
@@ -286,7 +286,6 @@ out:
 	/* Release the thread */
 	svc_exit_thread(rqstp);
 
-	unlock_kernel();
 	/* Release module */
 	unlock_kernel();
 	module_put_and_exit(0);
_


That fixed it, thanks.
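
A sketch of why the extra unlock_kernel() was fatal, modelled in userspace.  It assumes the big kernel lock keeps a per-task nesting depth that starts at -1, and that releasing it with no acquisition outstanding is what trips the BUG at lib/kernel_lock.c:83.  The struct and function names below are invented for illustration and are not the kernel's implementation:

```c
#include <stdbool.h>

/* Userspace model only: -1 means "this task does not hold the lock". */
struct task_model {
	int lock_depth;
};

static void model_lock(struct task_model *t)
{
	t->lock_depth++;
}

/*
 * Returns false where the real kernel would hit BUG(): the task is
 * trying to release a lock it does not hold.
 */
static bool model_unlock(struct task_model *t)
{
	if (t->lock_depth < 0)
		return false;
	t->lock_depth--;
	return true;
}

/* Replays the thread's exit path with `unlocks` release calls. */
bool exit_path_ok(int unlocks)
{
	struct task_model t = { .lock_depth = -1 };
	int i;

	model_lock(&t);			/* taken once when the thread starts */
	for (i = 0; i < unlocks; i++)
		if (!model_unlock(&t))
			return false;	/* unbalanced release */
	return true;
}
```

In this model one release balances the single acquisition, while a second release is the unbalanced case that the patch above removes.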



Aug 20 12:26:10 tornado kernel: Device  not ready.

2.  That message on the third line of the trace above: "kernel: Device  not 
ready." is being logged every few mins or so, I believe it is my SCSI CDROM 
that is causing it.  It also logs something similar after the SCSI driver has 
probed the device on boot:


Aug 20 12:24:36 tornado kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
Aug 20 12:24:36 tornado kernel: <Adaptec 2940 Ultra SCSI adapter>
Aug 20 12:24:36 tornado kernel: aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
Aug 20 12:24:36 tornado kernel:
Aug 20 12:24:36 tornado kernel:   Vendor: SONY  Model: CD-RW  CRX145S Rev: 1.0b
Aug 20 12:24:36 tornado kernel:   Type:   CD-ROM ANSI SCSI revision: 04
Aug 20 12:24:36 tornado kernel:  target0:0:6: Beginning Domain Validation
Aug 20 12:24:36 tornado kernel:  target0:0:6: Domain Validation skipping write tests
Aug 20 12:24:36 tornado kernel:  target0:0:6: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15)
Aug 20 12:24:36 tornado kernel:  target0:0:6: Ending Domain Validation
Aug 20 12:24:36 tornado kernel: Device  not ready.

This has been a problem for quite a few weeks now, albeit I believe, only a 
cosmetic one.


Is some application trying to poll the device?


I wonder if hald knows something about this and is polling.. however that 
message above about "Device  not ready" occurs when the kernel is booting, 
before any userspace stuff has started up.  Maybe hald is just being a bit 
aggressive in re-probing the drive after userspace launches.  By all accounts 
after a week of uptime the drive certainly ought to be ready, it seems to work 
ok ;-)


Note the extra space after 'Device' and 'not' which implies possibly some text 
is missing (which would have made it more clear which device is not exactly 
ready).  The case-sensitive strings "Device" and "not ready" appear together 
in scsi_lib.c and very few other places.
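
One way a message ends up with a doubled space like this is a format string whose name argument turns out to be empty.  A small sketch, assuming (not confirmed from the source here) that the message is built as "Device %s not ready."; format_not_ready is a made-up helper:

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats the message into buf, substituting an
 * empty string when no device name is available.
 */
int format_not_ready(char *buf, size_t len, const char *disk_name)
{
	return snprintf(buf, len, "Device %s not ready.",
			disk_name ? disk_name : "");
}
```

With a NULL name this leaves two adjacent spaces between "Device" and "not", which matches what is being logged.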


Is the device actually "not ready", or is it in reality ready and working? 
ie: what happens if you stick a CD in it?


The CD can be read, and the error messages go away.  They stay away even after 
the CD has been ejected.



4. PAM is complaining about "PAM audit_open() failed: Protocol not supported" 
and I can't log in as any user including root.  I would have picked this 
was a userspace problem, but it doesn't break with -rc5-mm1, yet reproducibly 
breaks with -rc6-mm1.  Weird.


hm.  How come you're able to use the machine then?


Machine was booting up ok, and things were being written to syslog.  Rebooted 
into -rc5-mm1 to investigate, and of course could boot into rc6-mm1 in single 
user mode, test and bring services up one by one from there.  Having two boxes 
helped too.



Is it possible to get an strace of this failure somehow?


Not sure if this is needed anymore, as I found that the problem goes away when 
I compile in kernel auditing.  This was not required for -rc5-mm1.  Is that 
change intended?
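
This would fit if the userspace audit library talks to the kernel over a netlink socket, which cannot be created when the kernel has no audit support.  A minimal sketch of such a probe, assuming audit_open() amounts to opening a NETLINK_AUDIT socket (an assumption about the library, not taken from its source); probe_audit_netlink is a hypothetical name:

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/*
 * Hypothetical probe: returns 0 if a NETLINK_AUDIT socket can be
 * created, otherwise the errno from socket().  On a kernel built
 * without auditing the likely value is EPROTONOSUPPORT, which
 * strerror() renders as "Protocol not supported".
 */
int probe_audit_netlink(void)
{
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT);

	if (fd < 0)
		return errno;
	close(fd);
	return 0;
}
```

Running this under both kernels would show whether the login failure is just the library refusing to continue without an audit socket.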


reuben


Re: 2.6.13-rc6-mm1

2005-08-19 Thread Reuben Farrelly

Hi,

On 19/08/2005 11:33 p.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm1/

- Lots of fixes, updates and cleanups all over the place.

- If you have the right debugging options set, this kernel will generate
  a storm of sleeping-in-atomic-code warnings at boot, from the scsi code.
  It is being worked on.


A few new problems cropped up with this kernel..

1. NFS seems to be unstable, oopsing when shutting down:

Aug 20 12:26:09 tornado shutdown: shutting down for system reboot
Aug 20 12:26:10 tornado init: Switching to runlevel: 6
Aug 20 12:26:10 tornado kernel: Device  not ready.
Aug 20 12:26:10 tornado last message repeated 4 times
Aug 20 12:26:11 tornado smokeping[2524]: Got TERM signal, terminating.
Aug 20 12:26:16 tornado rpc.mountd: Caught signal 15, un-registering and exiting.
Aug 20 12:26:20 tornado kernel: [ cut here ]
Aug 20 12:26:20 tornado kernel: kernel BUG at lib/kernel_lock.c:83!
Aug 20 12:26:20 tornado kernel: invalid operand:  [#1]
Aug 20 12:26:20 tornado kernel: SMP
Aug 20 12:26:20 tornado kernel: last sysfs file: /devices/pci:00/:00:1d.3/usb5/5-2/5-2.2/5-2.2.1/5-2.2.1.1/5-2.2.1.1:1.1/modalias
Aug 20 12:26:20 tornado kernel: Modules linked in: nfsd exportfs lockd eeprom sunrpc ipv6 iptable_filter binfmt_misc reiser4 zlib_deflate zlib_inflate dm_mod video thermal processor fan button ac i8xx_tco i2c_i801 sky2 sr_mod
Aug 20 12:26:20 tornado kernel: CPU:1
Aug 20 12:26:20 tornado kernel: EIP:0060:[<c0310845>]Not tainted VLI
Aug 20 12:26:20 tornado kernel: EFLAGS: 00010286   (2.6.13-rc6-mm1)
Aug 20 12:26:20 tornado kernel: EIP is at unlock_kernel+0x28/0x32
Aug 20 12:26:20 tornado kernel: eax:    ebx: 0009   ecx: f6a23f90   edx: f6adaa50
Aug 20 12:26:20 tornado kernel: esi: f6a23f54   edi: c191d2fc   ebp: f6b3ffa8   esp: f6b3ffa8
Aug 20 12:26:20 tornado kernel: ds: 007b   es: 007b   ss: 0068
Aug 20 12:26:20 tornado kernel: Process nfsd (pid: 2034, threadinfo=f6b3e000 task=f6adaa50)
Aug 20 12:26:20 tornado kernel: Stack: f6b3ffe4 f8e0e4c2 f8e2d648 f6b3e000 f6f9103c 00100100 00200200 f6adaa50
Aug 20 12:26:20 tornado kernel:  feff  fef8  f8e0e231
Aug 20 12:26:20 tornado kernel:  c01010b5 f6f9103c   5a5a5a5a a55a5a5a
Aug 20 12:26:20 tornado kernel: Call Trace:
Aug 20 12:26:20 tornado kernel:  [<c01039c3>] show_stack+0x94/0xca
Aug 20 12:26:20 tornado kernel:  [<c0103b6c>] show_registers+0x15a/0x1ea
Aug 20 12:26:20 tornado kernel:  [<c0103d8a>] die+0x108/0x183
Aug 20 12:26:20 tornado kernel:  [<c0310986>] do_trap+0x76/0xa1
Aug 20 12:26:20 tornado kernel:  [<c0104090>] do_invalid_op+0x97/0xa1
Aug 20 12:26:20 tornado kernel:  [<c0103693>] error_code+0x4f/0x54
Aug 20 12:26:20 tornado kernel:  [<f8e0e4c2>] nfsd+0x291/0x341 [nfsd]
Aug 20 12:26:20 tornado kernel:  [<c01010b5>] kernel_thread_helper+0x5/0xb
Aug 20 12:26:20 tornado kernel: Code: 5e 5d c3 55 89 e5 b8 00 e0 ff ff 21 e0 8b 10 8b 42 14 85 c0 78 15 83 e8 01 89 42 14 85 c0 79 09 f0 ff 05 40 e7 36 c0 7e 39 5d c3 <0f> 0b 53 00 37 e1 32 c0 eb e1 8d 05 40 e7 36 c0 e8 fe dd ff ff

Aug 20 12:26:20 tornado kernel:  [ cut here ]


2.  That message on the third line of the trace above: "kernel: Device  not 
ready." is being logged every few mins or so, I believe it is my SCSI CDROM 
that is causing it.  It also logs something similar after the SCSI driver has 
probed the device on boot:


Aug 20 12:24:36 tornado kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
Aug 20 12:24:36 tornado kernel: <Adaptec 2940 Ultra SCSI adapter>
Aug 20 12:24:36 tornado kernel: aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
Aug 20 12:24:36 tornado kernel:
Aug 20 12:24:36 tornado kernel:   Vendor: SONY  Model: CD-RW  CRX145S Rev: 1.0b
Aug 20 12:24:36 tornado kernel:   Type:   CD-ROM ANSI SCSI revision: 04
Aug 20 12:24:36 tornado kernel:  target0:0:6: Beginning Domain Validation
Aug 20 12:24:36 tornado kernel:  target0:0:6: Domain Validation skipping write tests
Aug 20 12:24:36 tornado kernel:  target0:0:6: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15)
Aug 20 12:24:36 tornado kernel:  target0:0:6: Ending Domain Validation
Aug 20 12:24:36 tornado kernel: Device  not ready.

This has been a problem for quite a few weeks now, albeit I believe, only a 
cosmetic one.


3. As I have a Marvell Yukon 2 chipset, I was _delighted_ to see a new driver 
from Stephen Hemminger appear in the netdev tree for it.  However it seems to 
be a bit broken: I get link up and a bit of traffic before it just stops 
passing traffic of any sort and requires an rmmod/modprobe to get going again. 
I've emailed him directly about this.


4. PAM is complaining about "PAM audit_open() failed: Protocol not supported" 
and I can't log in as any user including root.  I would have picked this 
was a userspace problem, but it doesn't break with -rc5-mm1, yet reproducibly 
breaks with -rc6-mm1.  Weird.


reuben


Re: 2.6.13-rc3-mm2

2005-07-28 Thread Reuben Farrelly



On 28/07/2005 9:10 p.m., Andrew Morton wrote:

Reuben Farrelly <[EMAIL PROTECTED]> wrote:

On 27/07/2005 9:45 a.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm2/


- Lots of fixes and updates all over the place.  There are probably over 100
  patches here which need to go into 2.6.13.

- A reminder that -mm commit activity may be monitored by subscribing to
  the mm-commits list.  Do

echo subscribe mm-commits | mail [EMAIL PROTECTED]


Also seeing this during boot-up:


This was happening in earlier -mm's was it not?


Hadn't seen it anytime recently..


last sysfs file:


grr, I need to fix that.


  [<c0103983>] show_stack+0x94/0xca
  [<c0103b37>] show_registers+0x165/0x1f9
  [<c0103d5d>] die+0x108/0x183
  [<c0318c3a>] do_page_fault+0x1ea/0x63d
  [<c0103657>] error_code+0x4f/0x54
  [<c018b5c3>] fill_read_buffer+0x2e/0x74
  [<c018b6fe>] sysfs_read_file+0x46/0x76


some dud sysfs file.


Didn't appear after a reboot of 2.6.13-rc3-mm2, and doesn't appear with 
2.6.13-rc3-mm3, so not too sure what to make of it now.  Will see if it 
reappears (box is otherwise stable).


Thanks,
reuben



Re: 2.6.13-rc3-mm2

2005-07-28 Thread Reuben Farrelly

On 27/07/2005 9:45 a.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm2/


- Lots of fixes and updates all over the place.  There are probably over 100
  patches here which need to go into 2.6.13.

- A reminder that -mm commit activity may be monitored by subscribing to
  the mm-commits list.  Do

echo subscribe mm-commits | mail [EMAIL PROTECTED]



Also seeing this during boot-up:

Adding 497972k swap on /dev/sda7.  Priority:1 extents:1 across:497972k
Adding 497972k swap on /dev/sdb7.  Priority:1 extents:1 across:497972k
Unable to handle kernel paging request at virtual address 00316173
 printing eip:
00316173
*pde = 
Oops:  [#1]
SMP
last sysfs file:
Modules linked in: binfmt_misc reiser4 zlib_deflate zlib_inflate dm_mod video thermal processor hotkey fan button ac i8xx_tco i2c_i801
CPU:0
EIP:0060:[<00316173>]Not tainted VLI
EFLAGS: 00010202   (2.6.13-rc3-mm2)
EIP is at 0x316173
eax: dfc05d24   ebx: dfc05d24   ecx: 00316173   edx: de87
esi: de87   edi: dfc05d2c   ebp: df4e5f3c   esp: df4e5f30
ds: 007b   es: 007b   ss: 0068
Process udev (pid: 1141, threadinfo=df4e4000 task=df24ea50)
Stack: c02135a7 dfc6f0e8 c037edf4 df4e5f54 c018b5c3 de5d2bec dfc6f0e8 dfc8b1ec
   1000 df4e5f74 c018b6fe df989030 080659b0 dfc6f0fc dfc8b1ec 1000
   c018b6b8 df4e5f94 c0157c8f df4e5fa0 080659b0  dfc8b1ec fff7
Call Trace:
 [<c0103983>] show_stack+0x94/0xca
 [<c0103b37>] show_registers+0x165/0x1f9
 [<c0103d5d>] die+0x108/0x183
 [<c0318c3a>] do_page_fault+0x1ea/0x63d
 [<c0103657>] error_code+0x4f/0x54
 [<c018b5c3>] fill_read_buffer+0x2e/0x74
 [<c018b6fe>] sysfs_read_file+0x46/0x76
 [<c0157c8f>] vfs_read+0x8a/0x146
 [<c0157fd7>] sys_read+0x3d/0x64
 [<c0102ae7>] sysenter_past_esp+0x54/0x75
Code:  Bad EIP value.
 <6>NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver

The machine continues on booting..

reuben



Re: 2.6.13-rc3-mm2

2005-07-27 Thread Reuben Farrelly

Hi,

On 27/07/2005 9:45 a.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm2/


- Lots of fixes and updates all over the place.  There are probably over 100
  patches here which need to go into 2.6.13.

- A reminder that -mm commit activity may be monitored by subscribing to
  the mm-commits list.  Do

echo subscribe mm-commits | mail [EMAIL PROTECTED]




Changes since 2.6.13-rc3-mm1:


A few more warnings in mostly the reiser4 code in this one compared to -mm1:


  LD  fs/ramfs/ramfs.o
  LD  fs/ramfs/built-in.o
  LD  fs/reiser4/built-in.o
  CC [M]  fs/reiser4/debug.o
In file included from fs/reiser4/plugin/plugin.h:26,
 from fs/reiser4/jnode.h:19,
 from fs/reiser4/lock.h:16,
 from fs/reiser4/context.h:15,
 from fs/reiser4/debug.c:32:
fs/reiser4/plugin/node/node40.h:83:5: warning: "GUESS_EXISTS" is not defined
  CC [M]  fs/reiser4/jnode.o


about 20 or so times during this part of the compilation, however it never 
quite bombs out.


and this one:


In file included from fs/reiser4/plugin/plugin.h:26,
 from fs/reiser4/jnode.h:19,
 from fs/reiser4/seal.c:42:
fs/reiser4/plugin/node/node40.h:83:5: warning: "GUESS_EXISTS" is not defined
fs/reiser4/seal.c:212:5: warning: "REISER4_DEBUG_OUTPUT" is not defined
  CC [M]  fs/reiser4/dscale.o
  CC [M]  fs/reiser4/flush_queue.o


  CC  net/ipv4/netfilter/ip_conntrack_core.o
net/ipv4/netfilter/ip_conntrack_core.c:726:5: warning: "CONFIG_IP_NF_CONNTRACK_MARK" is not defined
  CC  net/ipv4/netfilter/ip_conntrack_proto_generic.o


  CC  drivers/scsi/aic7xxx/aic7xxx_core.o
In file included from drivers/scsi/aic7xxx/aic7xxx_core.c:48:
drivers/scsi/aic7xxx/aicasm/aicasm_insformat.h:46:5: warning: "BYTE_ORDER" is not defined
drivers/scsi/aic7xxx/aicasm/aicasm_insformat.h:46:19: warning: "LITTLE_ENDIAN" is not defined
drivers/scsi/aic7xxx/aicasm/aicasm_insformat.h:64:5: warning: "BYTE_ORDER" is not defined
drivers/scsi/aic7xxx/aicasm/aicasm_insformat.h:64:19: warning: "LITTLE_ENDIAN" is not defined
drivers/scsi/aic7xxx/aicasm/aicasm_insformat.h:82:5: warning: "BYTE_ORDER" is not defined
drivers/scsi/aic7xxx/aicasm/aicasm_insformat.h:82:19: warning: "LITTLE_ENDIAN" is not defined
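
These are the warnings gcc emits with -Wundef when an #if tests a macro that was never defined.  The preprocessor quietly treats the name as 0, which is why the build carries on.  A short sketch using a made-up macro name, HYPOTHETICAL_OPTION, rather than any of the symbols above:

```c
/* HYPOTHETICAL_OPTION is intentionally never defined in this file. */

/* Draws a "HYPOTHETICAL_OPTION" is not defined warning under -Wundef. */
int option_via_if(void)
{
#if HYPOTHETICAL_OPTION
	return 1;
#else
	return 0;	/* undefined identifiers evaluate to 0 in #if */
#endif
}

/* Same result, but quiet: test for the definition explicitly. */
int option_via_ifdef(void)
{
#ifdef HYPOTHETICAL_OPTION
	return 1;
#else
	return 0;
#endif
}
```

Both functions take the #else branch here; only the first form is noisy, so switching the test to #ifdef (or defining the symbol to 0) is the usual way to silence it.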





reuben



Re: 2.6.12-rc2-mm1

2005-04-05 Thread Reuben Farrelly
Hi again
At 12:14 a.m. 6/04/2005, Adrian Bunk wrote:
On Tue, Apr 05, 2005 at 08:34:11PM +1200, Reuben Farrelly wrote:
> Hi,
Hi Reuben,
>...
> Hrm. Something changed between the last -mm release which compiled
> through, and this one..
>...
>   LD  .tmp_vmlinux1
> arch/i386/kernel/built-in.o(.init.text+0x1823): In function `setup_arch':
> : undefined reference to `acpi_boot_table_init'
> arch/i386/kernel/built-in.o(.init.text+0x1828): In function `setup_arch':
> : undefined reference to `acpi_boot_init'
> make: *** [.tmp_vmlinux1] Error 1
> [EMAIL PROTECTED] linux-2.6]#
>
> Backing out bk-acpi.patch works around it..
Please send your .config .
Have just figured out that it seems to be caused by having ACPI 
disabled in .config; once I re-enabled ACPI the build problem went away.

Config attached anyway, I imagine the problem is quite reproducible..
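
A link failure of this shape usually means a caller references functions whose definitions are only built when a config option is set.  A common remedy is for the header to provide no-op inline stubs in the disabled case.  The sketch below uses an invented option, CONFIG_HYPOTHETICAL_ACPI, and invented function names to show the pattern; it is not the fix that went into the ACPI tree:

```c
/* Normally in a header: real prototypes or stubs, depending on config. */
#ifdef CONFIG_HYPOTHETICAL_ACPI
int hypothetical_boot_table_init(void);	/* defined in the ACPI code */
int hypothetical_boot_init(void);
#else
static inline int hypothetical_boot_table_init(void) { return 0; }
static inline int hypothetical_boot_init(void) { return 0; }
#endif

/* The caller compiles and links the same way in both configurations. */
int setup_arch_model(void)
{
	int err = hypothetical_boot_table_init();

	if (err)
		return err;
	return hypothetical_boot_init();
}
```

With the option left undefined, as here, the stubs are used and nothing is left for the linker to resolve.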
Reuben



.config
Description: Binary data


Re: 2.6.12-rc2-mm1

2005-04-05 Thread Reuben Farrelly
Hi,
Andrew Morton wrote:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm1/
- x86 NMI handling seems to be bust in 2.6.12-rc2.  Try using
  `nmi_watchdog=0' if you experience weird crashes.
- The possible kernel-timer related hangs might possibly be fixed.  We
  haven't heard yet.
- Nobody said anything about the PM resume and DRI behaviour in
  2.6.12-rc1-mm4.  So it's all perfect now?
- Various fixes and updates.  Nothing earth-shattering.

Changes since 2.6.12-rc1-mm4:
 bk-acpi.patch
 bk-agpgart.patch
 bk-cifs.patch
 bk-cpufreq.patch
 bk-cryptodev.patch
 bk-driver-core.patch
 bk-drm.patch
 bk-drm-via.patch
 bk-ia64.patch
 bk-audit.patch
 bk-input.patch
 bk-jfs.patch
 bk-kbuild.patch
 bk-mtd.patch
 bk-netdev.patch
 bk-nfs.patch
 bk-ntfs.patch
 bk-scsi.patch
 bk-watchdog.patch
 Latest versions of subsystem trees
Hrm. Something changed between the last -mm release which compiled 
through, and this one..

 CHK include/linux/compile.h
  CHK usr/initramfs_list
  GEN .version
  CHK include/linux/compile.h
  UPD include/linux/compile.h
  CC  init/version.o
  LD  init/built-in.o
  LD  .tmp_vmlinux1
arch/i386/kernel/built-in.o(.init.text+0x1823): In function `setup_arch':
: undefined reference to `acpi_boot_table_init'
arch/i386/kernel/built-in.o(.init.text+0x1828): In function `setup_arch':
: undefined reference to `acpi_boot_init'
make: *** [.tmp_vmlinux1] Error 1
[EMAIL PROTECTED] linux-2.6]#
Backing out bk-acpi.patch works around it..
reuben
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.12-rc1-mm3

2005-04-01 Thread Reuben Farrelly
Hi Dmitry and others,
At 06:41 a.m. 31/03/2005, Dmitry Torokhov wrote:
On Monday 28 March 2005 06:02, Russell King wrote:
> Looks like something in the input layer went bang.  The code in
> serport_ldisc_write_wakeup is:
>
>    0:   8b 80 a8 09 00 00       mov    0x9a8(%eax),%eax
>    6:   8b 40 14                mov    0x14(%eax),%eax
>    9:   8b 50 70                mov    0x70(%eax),%edx   <
>    c:   85 d2                   test   %edx,%edx
>    e:   74 09                   je     0x19
>
> and the marked line exploded on you.  The above instructions correspond
> with:
>
> 0:  struct serport *sp = (struct serport *) tty->disc_data;
> 6:  serio_drv_write_wakeup(sp->serio);
> 9:  if (serio->drv
>
> So, "serio" was this strange 0xf3a6cdf8 value.  But why?  One for the
> input people I think.
Reuben, could you please try the patch below? Thanks!
Russell, could you please tell me if ldisc->write_wakeup (tty_wakeup) and
ldisc->read are allowed to be called from an IRQ context? IOW I wonder if
I can use spin_lock_bh instead of spin_lock_irqsave to protect serport
flags.
--
Dmitry
 serport.c |   98 +++---
 1 files changed, 68 insertions(+), 30 deletions(-)

Index: dtor/drivers/input/serio/serport.c
===
--- dtor.orig/drivers/input/serio/serport.c
+++ dtor/drivers/input/serio/serport.c
@@ -27,11 +27,15 @@ MODULE_LICENSE("GPL");
 MODULE_ALIAS_LDISC(N_MOUSE);

I've done some testing this afternoon and it seems that this patch 
fixes the problem in -mm4.  I don't even have a serial 
mouse/keyboard, but do have a serial PCI card onboard.  The box has a 
USB connection to a Belkin KVM instead of directly attached input devices.

I also note that it is occurring on kernel-smp-2.6.11-1.1219_FC4 - so 
it is probably a problem in mainline as well as -mm.

Now I'm crashing a bit further through the shutdown; here's the stack trace:
INIT: Sending processes the TERM signal
Stopping yum:  Disabling nightly yum update: [  OK  ]
[  OK  ]
Stopping cups-config-daemon: [  OK  ]
Stopping HAL daemon: [  OK  ]
Stopping system message bus: [  OK  ]
Stopping atd: [  OK  ]
Stopping cups: [  OK  ]
Shutting down xfs: [  OK  ]
[  OK  ] down console mouse services: [  OK  ]
Shutting down NFS mountd: [  OK  ]
Shutting down NFS daemon: nfsd: last server has exited
nfsd: unexporting all filesystems
RPC: error 5 connecting to server localhost
RPC: failed to contact portmap (errno -5).
Unable to handle kernel paging request at virtual address f2826d2c
 printing eip:
c01337a9
*pde = 
Oops:  [#1]
SMP DEBUG_PAGEALLOC
Modules linked in: nfsd exportfs md5 ipv6 lp snd_usb_audio snd_usb_lib pwc
videodev usb_storage autofs4 eeprom lm85 i2c_sensor rfcomm l2cap bluetooth nfs
lockd sunrpc dm_mod video button battery ac ohci1394 ieee1394 uhci_hcd ehci_hcd
parport_serial parport_pc parport hw_random i2c_i801 i2c_core emu10k1_gp
gameport snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep snd
soundcore e100 mii floppy ext3 jbd ata_piix libata sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c01337a9>]    Not tainted VLI
EFLAGS: 00010087   (2.6.12-rc1-mm4)
EIP is at worker_thread+0x149/0x230
eax: 0001   ebx: 0212   ecx: f7eb4018   edx: f2826d20
esi: f2826d24   edi: f7eb4000   ebp:    esp: f7e83f7c
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 8, threadinfo=f7e83000 task=f7fefad0)
Stack: f7eb4028 f7eb4010 f7eb4018 f7e83000 f2826d20 c014f4b0 0001 
   000f41fa 0001   f7fefad0 c011ea50 00100100 00200200
     fffc f7e46f54 f7eb4000 c0133660 c0137694 
Call Trace:
 [<c014f4b0>] cache_reap+0x0/0x240
 [<c011ea50>] default_wake_function+0x0/0x10
 [<c0133660>] worker_thread+0x0/0x230
 [<c0137694>] kthread+0x94/0xa0
 [<c0137600>] kthread+0x0/0xa0
 [<c01023f5>] kernel_thread_helper+0x5/0x10
Code: 00 00 89 f8 e8 19 e3 1e 00 89 c3 8b 47 40 40 89 47 40 83 f8 03 0f 8f bd 00
 00 00 8b 77 10 3b 74 24 04 74 71 8d 56 fc 89 54 24 10 <8b> 42 0c 89 44 24 14 8b
 6a 10 8b 46 04 8b 16 89 10 89 36 89 42
 [  OK  ]
Shutting down NFS quotas: [FAILED]
Shutting down NFS services:  [  OK  ]
Stopping sshd: [  OK  ]
Stopping postfix:  Shutting down postfix: <3>BUG: soft lockup detected on CPU#0!

Pid: 3413, comm:  rpc.rquotad
EIP: 0060:[<c0321ac0>] CPU: 0
EIP is at _spin_lock_irqsave+0x20/0x50
 EFLAGS: 0286    Not tainted  (2.6.12-rc1-mm4)
EAX: f7eb4000 EBX: 0246 ECX: f7eb4000 EDX: c22021a0
ESI: f7eb4000 EDI: c22021a0 EBP: c01335b0 DS: 007b ES: 007b
CR0: 8005003b CR2: 800147fc CR3: 37256d20 CR4: 06e0
 [<c013350c>] __queue_work+0xc/0x50
 [<c012cc17>] run_timer_softirq+0xd7/0x1c0
 [<c0128950>] __do_softirq+0x80/0x100
 [<c0106adb>] do_softirq+0x4b/0x50
 ===
 [<c010511c>] apic_timer_interrupt+0x1c/0x30
 [<c02b7ed8>] kfree_skbmem+0x8/0x20
 [<c02b007b>] cpufreq_governor+0x3b/0x50
 [<c014eed2>] kfree+0x62/0x90
 [<c02b7ed8>] kfree_skbmem+0x8/0x20
 [] __kfree_skb+0xdc/0x1a0
 [] netlink_recvmsg+0xf1/0x230
 [] 
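
A small sketch of the arithmetic behind the "symbol+0xoffset/0xsize" notation used in the traces above, assuming the usual reading: the address is the function start plus the offset, and the second number is the function's length. The helper name function_start is an illustrative assumption; the EIP value and location string are taken from the first oops in this message.

```python
import re

# Matches e.g. "worker_thread+0x149/0x230" anywhere in a line.
ENTRY_RE = re.compile(
    r"(?P<sym>[A-Za-z_][A-Za-z0-9_.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def function_start(address, location):
    """Return (symbol, start_address, size) for an oops location string."""
    match = ENTRY_RE.search(location)
    if match is None:
        raise ValueError("no symbol+offset/size found in %r" % location)
    offset = int(match.group("off"), 16)
    size = int(match.group("size"), 16)
    if offset >= size:
        raise ValueError("offset lies outside the function")
    return match.group("sym"), address - offset, size


if __name__ == "__main__":
    sym, start, size = function_start(
        0xC01337A9, "EIP is at worker_thread+0x149/0x230"
    )
    print("%s starts at %08x and is %d bytes long" % (sym, start, size))
```

The computed start, c0133660, also shows up in the stack dump of the same oops, which is consistent with the worker_thread+0x0/0x230 entry in the call trace.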


Re: 2.6.12-rc1-mm3

2005-03-28 Thread Reuben Farrelly
Reuben Farrelly wrote:
> I'm repeatably getting this crash on shutdown in -mm3, and a few
> releases earlier (but I can't be certain it was the same crash..)
>
> Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
> ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> ttyS4 at I/O 0xa400 (irq = 16) is a 16550A
> ttyS5 at I/O 0xa408 (irq = 16) is a 16550A
> This _may_ be the culprit, but I'm not sure:
> 03:03.0 Serial controller: Timedia Technology Co Ltd PCI2S550 (Dual
> 16550 UART) (rev 01) (prog-if 02 [16550])
> Subsystem: Timedia Technology Co Ltd: Unknown device 0002
> Flags: stepping, medium devsel, IRQ 16
> I/O ports at a400 [size=32]
Ugh.  I'm an idiot, that will teach me for having two sessions to boxes 
running at once.

Wrong info above, but the trace is still valid.
Correct info follows:
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS14 at I/O 0xb400 (irq = 217) is a 16550A
ttyS15 at I/O 0xb000 (irq = 217) is a 16550A
06:02.0 Serial controller: NetMos Technology PCI 9835 Multi-I/O Controller (rev 01) (prog-if 02 [16550])
Subsystem: LSI Logic / Symbios Logic 2S (16C550 UART)
Flags: medium devsel, IRQ 217
I/O ports at b400 [size=8]
I/O ports at b000 [size=8]
I/O ports at ac00 [size=8]
I/O ports at a800 [size=8]
I/O ports at a400 [size=8]
I/O ports at a000 [size=16]


> The board is an Intel D925XCV.
> Shutdown goes like this:   (yes, hyperterminal sucks for the ^M
> characters, sorry)

<trace omitted>

reuben


Re: 2.6.12-rc1-mm3

2005-03-28 Thread Reuben Farrelly
Hi,
Andrew Morton wrote:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm3/
- Mainly a bunch of fixes relative to 2.6.12-rc1-mm2.
- Again, we'd like people who have had recent DRM and USB resume problems to
  test and report, please.
- The bk-ide-dev tree is back after a couple of weeks of difficulties.
- Jeff asks that anyone who has had problems with the Silicon Image SATA
  drivers test sata_sil-corruption--lockup-fix.patch, which is included in
  this kernel.
I'm repeatably getting this crash on shutdown in -mm3, and a few 
releases earlier (but I can't be certain it was the same crash..)

Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
ttyS4 at I/O 0xa400 (irq = 16) is a 16550A
ttyS5 at I/O 0xa408 (irq = 16) is a 16550A
This _may_ be the culprit, but I'm not sure:
03:03.0 Serial controller: Timedia Technology Co Ltd PCI2S550 (Dual 
16550 UART) (rev 01) (prog-if 02 [16550])
Subsystem: Timedia Technology Co Ltd: Unknown device 0002
Flags: stepping, medium devsel, IRQ 16
I/O ports at a400 [size=32]

The board is an Intel D925XCV.
Shutdown goes like this:   (yes, hyperterminal sucks for the ^M 
characters, sorry)

INIT: Switching^MINIT: Sending processes the TERM signal
Stopping yum:  Disabling nightly yum update: [  OK  ]
[  OK  ]
Stopping cups-config-daemon: [  OK  ]
Stopping HAL daemon: [  OK  ]
Stopping system message bus: [  OK  ]
Stopping atd: [  OK  ]
Stopping cups: [  OK  ]
Shutting down xfs: [  OK  ]
Shutting down console mouse services: [  OK  ]
Unable to handle kernel paging request at virtual address f3a6ce68
 printing eip:
c0244109
*pde = 
Oops:  [#1]
SMP DEBUG_PAGEALLOC
Modules linked in: hidp hci_usb sermouse nfsd exportfs md5 ipv6 lp autofs4
eeprom lm85 i2c_sensor rfcomm l2cap bluetooth nfs lockd sunrpc usb_storage pwc
videodev dm_mod video button battery ac ohci1394 ieee1394 uhci_hcd ehci_hcd
parport_serial parport_pc parport hw_random i2c_i801 i2c_core emu10k1_gp
gameport e100 mii floppy ext3 jbd ata_piix libata sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c0244109>]    Not tainted VLI
EFLAGS: 00010286   (2.6.12-rc1-mm3)
EIP is at serport_ldisc_write_wakeup+0x9/0x20
eax: f3a6cdf8   ebx: f73d7000   ecx: c038e374   edx: c0244100
esi: f73d700c   edi: f73d7000   ebp: c049e900   esp: f7568dc0
ds: 007b   es: 007b   ss: 0068
Process inputattach (pid: 2932, threadinfo=f7568000 task=f6993ac0)
Stack: c021bb08 0286 f6c31000 c0245e4a f6c31018 f73d7000 f67c1e88 
cbff5c
    c021ceaa    c1e46000 c1e46000 

    c011b739 0046 c1e46000 0001 f2c0 f2c0 
c011b8b4
Call Trace:
^M [<c021bb08>] tty_wakeup+0x48/0x70
^M [<c0245e4a>] uart_close+0xca/0x1e0
^M [<c021ceaa>] release_dev+0x14a/0x750
^M [<c011b739>] change_page_attr+0x29/0x60
^M [<c011b8b4>] kernel_map_pages+0x84/0xa0
^M [<c014cbca>] store_stackinfo+0x5a/0x90
^M [<c01664c8>] __fput+0x108/0x180
^M [<c018b59b>] inotify_inode_queue_event+0x2b/0x40
^M [<c021d97f>] tty_release+0xf/0x20
^M [<c016644a>] __fput+0x8a/0x180
^M [<c0164d7b>] filp_close+0x4b/0x70
^M [<c0125254>] put_files_struct+0x74/0x100
^M [<c012610c>] do_exit+0x11c/0x420
^M [<c012647d>] do_group_exit+0x2d/0xa0
^M [<c012f74c>] get_signal_to_deliver+0x20c/0x310
^M [<c0103deb>] do_signal+0x5b/0x140
^M [<c011ea89>] __wake_up+0x29/0x40
^M [<c021b60c>] tty_ldisc_deref+0x3c/0x70
^M [<c021c267>] tty_read+0xc7/0x130
^M [<c0243fb0>] serport_ldisc_read+0x0/0x100
^M [<c016ecd3>] sys_fstat64+0x23/0x30
^M [<c021c1a0>] tty_read+0x0/0x130
^M [<c0165547>] vfs_read+0x97/0x140
^M [<c016585c>] sys_read+0x3c/0x70
^M [<c0103efa>] do_notify_resume+0x2a/0x40
^M [<c01040be>] work_notifysig+0x13/0x25
^MCode: e8 0f b6 c5 88 4b 4b 31 d2 c1 e9 10 88 43 4a 88 4b 49 89 d0 5b 
c3 8d b6 00 00 00 00 8d bf 00 00 00 00 8b 80 a8 09 00 00 8b
40 14 <8b> 50 70 85 d2 74 09 8b 52 10 85 d2 74 02 ff d2 c3 90 90 90 90
^M BUG: atomic counter underflow at:
^M [<c0126386>] do_exit+0x396/0x420
^M [<c01059f6>] die+0x166/0x170
^M [<c011a7a3>] do_page_fault+0x1f3/0x6a1
^M [<c0244109>] serport_ldisc_write_wakeup+0x9/0x20
^M [<c011b36c>] __change_page_attr+0x4c/0x3f0
^M [<c011a5b0>] do_page_fault+0x0/0x6a1
^M [<c010522f>] error_code+0x4f/0x60
^M [<c0244100>] serport_ldisc_write_wakeup+0x0/0x20
^M [<c0244109>] serport_ldisc_write_wakeup+0x9/0x20
^M [<c021bb08>] tty_wakeup+0x48/0x70
^M [<c0245e4a>] uart_close+0xca/0x1e0
^M [<c021ceaa>] release_dev+0x14a/0x750
^M [<c011b739>] change_page_attr+0x29/0x60
^M [<c011b8b4>] kernel_map_pages+0x84/0xa0
^M [<c014cbca>] store_stackinfo+0x5a/0x90
^M [<c01664c8>] __fput+0x108/0x180
^M [<c018b59b>] inotify_inode_queue_event+0x2b/0x40
^M [<c021d97f>] tty_release+0xf/0x20
^M [<c016644a>] __fput+0x8a/0x180
^M [<c0164d7b>] filp_close+0x4b/0x70
^M [<c0125254>] put_files_struct+0x74/0x100
^M [<c012610c>] do_exit+0x11c/0x420
^M [<c012647d>] do_group_exit+0x2d/0xa0
^M [<c012f74c>] get_signal_to_deliver+0x20c/0x310
^M [<c0103deb>] do_signal+0x5b/0x140
^M [<c011ea89>] __wake_up+0x29/0x40
^M [<c021b60c>] tty_ldisc_deref+0x3c/0x70
^M [<c021c267>] tty_read+0xc7/0x130
^M [<c0243fb0>] serport_ldisc_read+0x0/0x100
^M [<c016ecd3>] sys_fstat64+0x23/0x30
^M [<c021c1a0>] tty_read+0x0/0x130
^M [<c0165547>] vfs_read+0x97/0x140
^M [<c016585c>] sys_read+0x3c/0x70
^M [<c0103efa>] do_notify_resume+0x2a/0x40
^M [<c01040be>] work_notifysig+0x13/0x25
^MUnable to handle kernel NULL pointer dereference at virtual address 
0020
^M printing eip:
^Mc0121320
^M*pde = 0041f001
^MOops:  [#2]
^MSMP 
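
A rough sketch of how the numbers in the first oops above fit together, assuming the byte in angle brackets on the "Code:" line marks the start of the faulting instruction. The helper names are illustrative, and decode_mov_disp8 is a deliberately tiny decoder that only understands the single form "8b /r with an 8-bit displacement"; it is not a general x86 disassembler. The register value and code bytes are copied from the oops.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_byte_index(code_bytes):
    """Return (index, cleaned_tokens) for the byte marked <xx>."""
    tokens = code_bytes.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            return i, [t.strip("<>") for t in tokens]
    raise ValueError("no <xx> marker found")


def decode_mov_disp8(insn):
    """Decode 'mov disp8(%base),%dest' (opcode 8b, mod=01, no SIB byte)."""
    opcode, modrm, disp = insn[:3]
    if opcode != 0x8B or (modrm >> 6) != 0b01 or (modrm & 7) == 0b100:
        raise ValueError("instruction form not handled by this sketch")
    if disp >= 0x80:  # the displacement is a signed 8-bit value
        disp -= 0x100
    return REGS[(modrm >> 3) & 7], REGS[modrm & 7], disp


# Tail of the Code: line and the eax value from the oops above.
code = "8b 80 a8 09 00 00 8b 40 14 <8b> 50 70 85 d2 74 09"
registers = {"eax": 0xF3A6CDF8}

idx, raw = faulting_byte_index(code)
dest, base, disp = decode_mov_disp8([int(b, 16) for b in raw[idx:idx + 3]])
fault = (registers[base] + disp) & 0xFFFFFFFF

if __name__ == "__main__":
    print("mov 0x%x(%%%s),%%%s reads from %08x" % (disp, base, dest, fault))
```

Under these assumptions the load address works out to f3a6ce68, the same value reported in the "Unable to handle kernel paging request" line, which suggests the pointer held in eax referred to memory that was no longer mapped.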


Re: 2.6.12-rc1-mm2

2005-03-24 Thread Reuben Farrelly
Hi,
Andrew Morton wrote:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc1/2.6.12-rc1-mm2/
- Added David Miller's networking tree to the -mm lineup as bk-net.patch. 

- Added Herbert Xu's crypto development tree to the -mm lineup as
  bk-cryptodev.patch.
  -mm kernels now aggregate Linus's tree and 34 subsystem trees.  Usually
  they are pulled 3-4 hours before the release of the -mm kernel.  

  Usually it is possible to determine the latest cset from each tree by
  looking at the first couple of lines of the relevant patch in the
  broken-out/ directory.  Although sometimes it isn't there if I had to
  massage the diff.
- There may be an x86_64 problem here, although it works for me.  If it
  fails early in boot, try reverting
  x86_64-separate-amd-cmp-detection-from-hyper-threading.patch
- There's some work here on the recent USB PM resume bugs.  If you had
  problems there, please test and be sure to cc
  linux-usb-devel@lists.sourceforge.net in any reports.
- Some fixes for the recent DRM problems.
- Big DVB update
- md updates
- nfs4 server updates
- Lots more fixes
- Lots more bugs.
Fails to compile for me:
  CC [M]  fs/nfs/dir.o
  CC [M]  fs/nfs/inode.o
  CC [M]  fs/nfs/nfs4proc.o
fs/nfs/nfs4proc.c:2976: error: static declaration of 
'nfs4_file_inode_operations' follows non-static declaration
fs/nfs/nfs4_fs.h:179: error: previous declaration of 
'nfs4_file_inode_operations' was here
make[2]: *** [fs/nfs/nfs4proc.o] Error 1
make[1]: *** [fs/nfs] Error 2
make: *** [fs] Error 2

I needed to remove this line:
extern struct inode_operations nfs4_file_inode_operations;
from  fs/nfs/nfs4_fs.h.
Patch attached.
Reuben

--- fs/nfs/nfs4_fs.h	2005-03-25 11:40:51.0 +1200
+++ fs/nfs/nfs4_fs.h	2005-03-25 11:44:28.0 +1200
@@ -176,7 +176,6 @@
 
 extern struct dentry_operations nfs4_dentry_operations;
 extern struct inode_operations nfs4_dir_inode_operations;
-extern struct inode_operations nfs4_file_inode_operations;
 
 /* inode.c */
 extern ssize_t nfs4_getxattr(struct dentry *, const char *, void *, size_t);


Re: 2.6.11-mm3

2005-03-12 Thread Reuben Farrelly
At 12:42 a.m. 13/03/2005, Andrew Morton wrote:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11/2.6.11-mm3/
- A new version of the "acpi poweroff fix".  People who were having trouble
  with ACPI poweroff, please test and report.
- A very large update to the CFQ I/O scheduler.  Treat with caution, run
  benchmarks.  Remember that the I/O scheduler can be selected on a per-disk
  basis with
echo as > /sys/block/sda/queue/scheduler
echo deadline > /sys/block/sda/queue/scheduler
echo cfq > /sys/block/sda/queue/scheduler
- video-for-linux update

Ugh, NTFS is br0ken:
  CC [M]  fs/ntfs/attrib.o
fs/ntfs/attrib.c: In function 'ntfs_attr_make_non_resident':
fs/ntfs/attrib.c:1295: warning: implicit declaration of function 
'ntfs_cluster_alloc'
fs/ntfs/attrib.c:1296: error: 'DATA_ZONE' undeclared (first use in this 
function)
fs/ntfs/attrib.c:1296: error: (Each undeclared identifier is reported only once
fs/ntfs/attrib.c:1296: error: for each function it appears in.)
fs/ntfs/attrib.c:1296: warning: assignment makes pointer from integer 
without a cast
fs/ntfs/attrib.c:1435: warning: implicit declaration of function 
'flush_dcache_mft_record_page'
fs/ntfs/attrib.c:1436: warning: implicit declaration of function 
'mark_mft_record_dirty'
fs/ntfs/attrib.c:1443: warning: implicit declaration of function 
'mark_page_accessed'
fs/ntfs/attrib.c:1521: warning: implicit declaration of function 
'ntfs_cluster_free_from_rl'
make[2]: *** [fs/ntfs/attrib.o] Error 1
make[1]: *** [fs/ntfs] Error 2
make: *** [fs] Error 2

Compile goes through to completion fine if I back out bk-ntfs.patch.
Using gcc-4, but this problem did not exist in -mm2.
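
For anyone wanting to repeat the back-out, a minimal sketch of reverse-applying a single patch to a tree is below. It assumes you have the individual patch file to hand and that it applies with -p1; the function name back_out and the paths in the usage line are hypothetical. --dry-run is a GNU patch option.

```shell
# back_out PATCHFILE [TREE]  (hypothetical helper name)
# Reverse-apply PATCHFILE inside TREE (default: current directory).
# PATCHFILE must be an absolute path because we cd into TREE first.
# The dry run goes first, so a patch that no longer reverses cleanly
# leaves the tree untouched.
back_out() {
    patchfile=$1
    tree=${2:-.}
    ( cd "$tree" &&
      patch -R -p1 --dry-run < "$patchfile" > /dev/null &&
      patch -R -p1 < "$patchfile" )
}
```

Usage would be something like: back_out /tmp/bk-ntfs.patch /usr/src/linux-2.6.11-mm3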
reuben
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Breakage with raid in 2.6.11-rc1-mm1 [Regression in mm]

2005-01-16 Thread Reuben Farrelly
Hi,
Reuben Farrelly wrote:
At 12:58 a.m. 15/01/2005, Andrew Morton wrote:
Reuben Farrelly <[EMAIL PROTECTED]> wrote:
>
> Something seems to have broken with 2.6.11-rc1-mm1, which worked ok with
> 2.6.10-mm3.
>
> NET: Registered protocol family 17
> Starting balanced_irq
> BIOS EDD facility v0.16 2004-Jun-25, 2 devices found
> md: Autodetecting RAID arrays.
> md: autorun ...
> md: ... autorun DONE.
> <snip>
> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
>
> The system is running 5 RAID-1 partitions, and md2 is the root as per
> grub.conf.  Problem seems to be that raid autodetection finds no raid
> partitions :(
>
> The two ST380013AS SATA drives are detected earlier in the boot, so I don't
> think that's the problem..

hm, the only raidy thing we have in there is the below.  Maybe you could
try reverting that?
--- 25/drivers/md/raid5.c~raid5-overlapping-read-hack   2005-01-09 22:20:40.211246912 -0800
+++ 25-akpm/drivers/md/raid5.c  2005-01-09 22:20:40.216246152 -0800
@@ -232,6 +232,7 @@ static struct stripe_head *__find_stripe
 }

 static void unplug_slaves(mddev_t *mddev);
+static void raid5_unplug_device(request_queue_t *q);
 static struct stripe_head *get_active_stripe(raid5_conf_t *conf, sector_t sector,
 					     int pd_idx, int noblock)

OK, the breakage occurred somewhere between 2.6.10-mm3 (works) and
2.6.11-rc1 (doesn't work), i.e. it wasn't introduced in the latest -mm
patchset as I first thought.

Are there any other patches that might be worth backing out?
reuben
I did a full untar of the source and rebuilt my (crusty old) config file
from scratch, and it seems to have come right now.  Can't really explain
it though... but it obviously wasn't a problem with the -mm release as I
first thought.  Now running -rc1-mm1 with no problems and no other patches.
Thanks to those who helped on what turned out to be a false alarm.
reuben


