Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-10 Thread Bob Tracy
Ivan Kokshaysky wrote:
> On Sat, Dec 08, 2007 at 10:19:39PM -0600, Bob Tracy wrote:
> > I *do* have CONFIG_MAGIC_SYSRQ set.  Anyone care to bet whether my
> > machine starts working again if I disable it?  Sheesh...
> 
> Incredible...
> 
> Toggling CONFIG_MAGIC_SYSRQ works for me too, so I'm finally able
> to reproduce the problem (which is the main positive result so far ;-)
> 
> There are lots of possible reasons why this happens, but at the
> moment I honestly have no idea.
> For now I have reassigned the bug #9457 to myself and will gradually hack
> into udev...

Thanks...  Let me know if there's anything useful I can do to help.

--Bob T.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-10 Thread Bob Tracy
Kay Sievers wrote:
> On Fri, 2007-12-07 at 23:05 -0600, Bob Tracy wrote:
> > Kay Sievers wrote:
> > > Is the udev daemon (still) running while it fails?
> > 
> > Yes, and there's something else I forgot to mention that may be
> > significant...  For the bad case, in addition to udevd, "ps -ef"
> > shows a "sh -e /lib/udev/net.agent" running with a PPID of 1.  This
> > process doesn't exit until I reboot.  If this is normal under the
> > circumstances, please disregard.
> 
> Does SysRq-T show where it hangs?

A quick comparison of the trace sections for udevd and net.agent indicates
those traces are identical: none of the function names in the traces appear
to be what you might be looking for, i.e., the processes appear to have been
waiting for an event of some kind, and woke up long enough to process the
SysRq-T keyboard interrupt and the corresponding action.

Hmm...  Ok...  The state information itself is probably more useful in
this context.  Here's the info for net.agent:

net.agent  S  fc32c37c   08951
  fc743b10  0010  fc4f3b5c  fc7601a8
0001  0074  fc747758  fc00230f
0007  0007  fc4f390c  0010
fc4e9eb8  fc00230f  0014  fc0023085140
0001  0014  fc1de000  0001
fc55dcfc  fc0023085140  fc00232788c0  0001

Addresses of presumed interest from System.map:

fc32c000 t do_wait
fc74 D init_thread_union
fc4f3b40 t sysrq_handle_showstate
fc7601a8 d sysrq_showstate_op
fc747758 D console_printk
fc002...  not in System.map (module?) -- I'll track this down later
   if needed.
fc4f3850 T __handle_sysrq
fc4e9850 t kbd_event
fc1...  not in System.map (?? begins with fc30 A swapper_pg_dir)
fc55dc30 t input_pass_event
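
The lookup done by hand above (match each word from the trace against the closest System.map entry at or below it) can be sketched as follows. This is only an illustration: `SYMBOLS` and `nearest_symbol` are made-up names, the table holds just the text symbols quoted in this message, and the addresses are copied exactly as they appear in this archive, where they look truncated. A real System.map has thousands of entries, so the nearest match in this small table is not necessarily the true one.

```python
import bisect

# Text symbols quoted in the message, as (address, name) pairs.
# The addresses are as shown in the archive and are illustrative only.
SYMBOLS = sorted([
    (0xfc32c000, "do_wait"),
    (0xfc4e9850, "kbd_event"),
    (0xfc4f3850, "__handle_sysrq"),
    (0xfc4f3b40, "sysrq_handle_showstate"),
    (0xfc55dc30, "input_pass_event"),
])


def nearest_symbol(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, else None."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, addr - start


# Words taken from the net.agent trace in this message.
for value in (0xfc32c37c, 0xfc4f3b5c, 0xfc4f390c, 0xfc4e9eb8, 0xfc55dcfc):
    name, offset = nearest_symbol(SYMBOLS, value)
    print(f"{value:x} -> {name}+0x{offset:x}")
```

Under those assumptions the first trace word resolves to an offset inside do_wait, which fits the observation that the process is simply sleeping in wait.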

--Bob T.


Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-08 Thread Bob Tracy
Michael Cree wrote:
> Kay Sievers wrote:
> > On Fri, 2007-12-07 at 23:05 -0600, Bob Tracy wrote:
> >> Kay Sievers wrote:
> >>> Is the udev daemon (still) running while it fails?
> >> Yes, and there's something else I forgot to mention that may be
> >> significant...  For the bad case, in addition to udevd, "ps -ef"
> >> shows a "sh -e /lib/udev/net.agent" running with a PPID of 1.  This
> >> process doesn't exit until I reboot.  If this is normal under the
> >> circumstances, please disregard.
> > 
> > Does SysRq-T show where it hangs?
> 
> Ummm... No.  I didn't have the CONFIG_MAGIC_SYSRQ flag set, so I set it, 
> and recompiled the kernel.  Guess what - now the system comes up 
> normally without any problem.  The block devices appear in /dev.  To 
> recap: without CONFIG_MAGIC_SYSRQ on the 2.6.24-rc3 kernel the missing 
> block devices error in /dev occurs and the init scripts fall over on 
> startup, and with CONFIG_MAGIC_SYSRQ the system comes up normally.

I *do* have CONFIG_MAGIC_SYSRQ set.  Anyone care to bet whether my
machine starts working again if I disable it?  Sheesh...  The "kernel
alignment issue" theory is making sense...  We change the size of an
initialized variable with the patch, and the problem shows up.  We
shift starting addresses a different way by tweaking kernel options,
and two wrongs make a right?  I've seen it happen, and tracking this
down isn't going to be easy.  Anyone want to wade through the different
System.map files and hazard a guess where we're leaving the rails?

Here's a very brief diff excerpt between the System.map files corresponding
to "sysctl_check patch reverted" (the -dirty version) and "with sysctl_check
patch".  At least they agree up to line 10870 :-) ...

--- /boot/System.map-2.6.24-rc2-g6f37ac79-dirty 2007-12-07 08:03:50.0 -0600
+++ System.map  2007-12-07 13:43:37.0 -0600
@@ -10868,9414 +10868,9414 @@
 fc684b00 R kallsyms_markers
 fc684d00 R kallsyms_token_table
 fc685100 R kallsyms_token_index
-fc6f61e0 r __pci_fixup_PCI_VENDOR_ID_SERVERWORKSPCI_DEVICE_ID_SERVERWORKS_CSB5IDEquirk_svwks_csb5ide
-fc6f61e0 R __start_pci_fixups_early
-fc6f61f0 r __pci_fixup_PCI_VENDOR_ID_INTELPCI_DEVICE_ID_INTEL_82801CA_10quirk_ide_samemode
(...)
-fc716120 r __param_bic_scale
-fc716148 r __param_tcp_friendliness
-fc716170 R __end_rodata
-fc716170 R __stop___param
+fc6f61f0 r __pci_fixup_PCI_VENDOR_ID_SERVERWORKSPCI_DEVICE_ID_SERVERWORKS_CSB5IDEquirk_svwks_csb5ide
+fc6f61f0 R __start_pci_fixups_early
+fc6f6200 r __pci_fixup_PCI_VENDOR_ID_INTELPCI_DEVICE_ID_INTEL_82801CA_10quirk_ide_samemode
(...)
+fc716130 r __param_bic_scale
+fc716158 r __param_tcp_friendliness
+fc716180 R __end_rodata
+fc716180 R __stop___param
 fc718000 A __init_begin
 fc718000 T _sinittext
 fc718000 t set_reset_devices
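
For anyone who wants to wade through the two System.map files mechanically, a rough sketch of a name-by-name comparison follows. The helper names `parse_system_map` and `address_shifts` are invented for this example, and the sample data is a handful of lines copied from the excerpt above (with the addresses as this archive shows them). It only reports how far each symbol moved; it says nothing about why the move matters.

```python
def parse_system_map(text):
    """Map symbol name -> address for 'ADDR TYPE NAME' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _kind, name = parts
        symbols[name] = int(addr, 16)
    return symbols


def address_shifts(old_text, new_text):
    """Return {name: new_addr - old_addr} for symbols whose address changed."""
    old = parse_system_map(old_text)
    new = parse_system_map(new_text)
    return {
        name: new[name] - old[name]
        for name in old
        if name in new and new[name] != old[name]
    }


# A few lines from the excerpt: reverted (-dirty) map vs. patched map.
OLD = """\
fc684b00 R kallsyms_markers
fc6f61e0 R __start_pci_fixups_early
fc716170 R __end_rodata
fc718000 A __init_begin
"""
NEW = """\
fc684b00 R kallsyms_markers
fc6f61f0 R __start_pci_fixups_early
fc716180 R __end_rodata
fc718000 A __init_begin
"""

for name, delta in address_shifts(OLD, NEW).items():
    print(f"{name}: moved by {delta:#x}")
```

On the sample lines, the symbols between kallsyms_token_index and __init_begin move by 0x10, while everything before and after stays put.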

> When running the broken kernel udev is running (according to 'ps') and 
> executing /sbin/udevtrigger manually generates a number of errors of the 
> form:
> 
> scsi_id[]: scsi_id: unable to access '/block'
> 
> The missing /dev/* entries do not appear.

I don't get the errors that Michael is seeing, and udevtrigger seems to
be exiting without errors (return code 0).  The last part is the same:
the missing /dev/* entries do not appear.

--Bob T.


Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy
Kay Sievers wrote:
> Is the udev daemon (still) running while it fails?

Yes, and there's something else I forgot to mention that may be
significant...  For the bad case, in addition to udevd, "ps -ef"
shows a "sh -e /lib/udev/net.agent" running with a PPID of 1.  This
process doesn't exit until I reboot.  If this is normal under the
circumstances, please disregard.

-- 
--------
Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy
Kay Sievers wrote:
> Is the udev daemon (still) running while it fails?

Yes.

> If you run /sbin/udevtrigger, do the nodes appear?

No.  Exit status is 0, and there are no errors.  Everything looks
fine under /sys/block, and there doesn't seem to be a problem with
/proc/devices either.

-- 
--------
Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy
Kay Sievers wrote:
> Yeah, that looks all fine.
> 
> What distro is that, and what's the udev version?

Mine is Debian Etch, normally with the latest released or -rcX kernel
from kernel.org.  Updates current as of about 18 hours ago.  Udev
package version is 0.105-4.  The RELEASE-NOTES file in /usr/share/doc/udev
says "udev 105".

> You are booting your kernel with an initramfs?

Not in my case: everything I need at boot time is built-in.

> Is the udev daemon (still) running while it fails?
> 
> If you run /sbin/udevtrigger, do the nodes appear?

I can answer the above later when I'm back in front of the machine, but
even in the "not good" case, I still see the following messages from
the /etc/rcS.d/S03udev file:

Starting the hotplug events dispatcher udevd.
Synthesizing the initial hotplug events.

This is where udevtrigger gets called, followed by the load_input_modules
and create_dev_makedev functions, then...

Waiting for /dev to be fully populated.

which is where udevsettle gets called.

None of the above appear to be exiting abnormally for the bad case, but
I'll definitely take a closer look at what MAKEDEV (/dev/MAKEDEV -->
/sbin/MAKEDEV) is doing.  In particular, Debian MAKEDEV is looking at
/proc/devices to decide what to do, so maybe "cat /proc/devices" would
be useful to look at for the broken case.

-- 
--------
Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy
Kay Sievers wrote:
> On Fri, 2007-12-07 at 19:06 +0100, Ingo Molnar wrote:
> > i'm not sure how to do direct debugging on udev, so i can only guess 
> > about what effect on the kernel side could have caused this. One bad 
> > hack would be to "probe" udevd's behavior by changing the NET_TR entry 
> > in various ways:
> > 
> >   "tr" -> "token-ring" # breaks
> >   "tr" -> "tr" # works
> >   "tr" -> "token-rin0" # ?(1)
> >   "tr" -> "TR" # ?(2)
> > 
> > the question is, does tweak (1) and tweak (2) work or break?
> > 
> > but it would be a lot more effective i guess to get some udevd expert's 
> > attention on this ...
> 
> Could we get the output of:
>   ls -l /sys/block/sda/
> and:
>   grep . /sys/block/sda/*/dev
> ?

Here are the requested items for the 2.6.24-rc2-g6f37ac79-dirty kernel
(the working one with the sysctl_check.c patch reverted):

smirkin:/# ls -l /sys/block/sda
total 0
-r--r--r-- 1 root root 8192 Dec  7 08:36 capability
-r--r--r-- 1 root root 8192 Dec  7 08:36 dev
lrwxrwxrwx 1 root root    0 Dec  7 08:36 device -> ../../devices/pci:00/:00:14.0/:01:09.0/host0/target0:0:0/0:0:0:0
drwxr-xr-x 2 root root    0 Dec  7 08:36 holders
drwxr-xr-x 3 root root    0 Dec  7 08:36 queue
-r--r--r-- 1 root root 8192 Dec  7 08:36 range
-r--r--r-- 1 root root 8192 Dec  7 08:36 removable
drwxr-xr-x 3 root root    0 Dec  7 08:36 sda1
drwxr-xr-x 3 root root    0 Dec  7 08:36 sda2
drwxr-xr-x 3 root root    0 Dec  7 08:36 sda3
drwxr-xr-x 3 root root    0 Dec  7 08:36 sda4
drwxr-xr-x 3 root root    0 Dec  7 08:36 sda5
drwxr-xr-x 3 root root    0 Dec  7 08:36 sda6
drwxr-xr-x 3 root root    0 Dec  7 08:36 sda7
-r--r--r-- 1 root root 8192 Dec  7 08:36 size
drwxr-xr-x 2 root root    0 Dec  7 08:36 slaves
-r--r--r-- 1 root root 8192 Dec  7 08:36 stat
lrwxrwxrwx 1 root root    0 Dec  7 08:36 subsystem -> ../../block
--w------- 1 root root 8192 Dec  7 08:36 uevent
smirkin:/# grep . /sys/block/sda/*/dev
/sys/block/sda/sda1/dev:8:1
/sys/block/sda/sda2/dev:8:2
/sys/block/sda/sda3/dev:8:3
/sys/block/sda/sda4/dev:8:4
/sys/block/sda/sda5/dev:8:5
/sys/block/sda/sda6/dev:8:6
/sys/block/sda/sda7/dev:8:7
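
Each line of that grep output is "path:major:minor", i.e. the numbers a device node under /dev would need for that partition. A small sketch of pulling those pairs out of the output, so a good boot and a bad boot can be compared, might look like this. `parse_dev_lines` is a made-up helper and the sample text is taken from the output above.

```python
def parse_dev_lines(text):
    """Turn 'grep . /sys/block/sda/*/dev' output into {name: (major, minor)}."""
    devices = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so the path itself is left untouched.
        path, major, minor = line.rsplit(":", 2)
        name = path.split("/")[-2]  # directory that holds the 'dev' file
        devices[name] = (int(major), int(minor))
    return devices


SAMPLE = """\
/sys/block/sda/sda1/dev:8:1
/sys/block/sda/sda2/dev:8:2
/sys/block/sda/sda7/dev:8:7
"""

for name, (major, minor) in parse_dev_lines(SAMPLE).items():
    print(f"/dev/{name} should be block device {major}:{minor}")
```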

Assuming /sys/block even exists for the non-working case, I'll forward
that info in a few hours when I can get home to reboot the machine.

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy
Ingo Molnar wrote:
> 
> * Bob Tracy <[EMAIL PROTECTED]> wrote:
> 
> > > Current state of the source tree is the 6f37ac... version, so I'll 
> > > start backing out the above diffs in related groups and continue 
> > > until I've got a working kernel.  For lack of an obvious target, 
> > > I'll start with the seemingly innocuous change to sysctl_check.c.  
> > > I'll report back when I've got something.
> > 
> > That was quick :-).  Backing out the sysctl_check.c diff gives me a 
> > working kernel.  Beats the [EMAIL PROTECTED] out of me how/why, though.
> > 
> > Michael Cree: could you try backing out the diff below from your 
> > 2.6.24-rc3 tree and see if things are now working for you?
> > 
> > Here's "uname -a", just to confirm (maybe) I'm running on what I say 
> > works:
> > 
> > Linux smirkin 2.6.24-rc2-g6f37ac79-dirty #2 Fri Dec 7 08:03:12 CST 2007 
> > alpha
> > 
> > Here's the diff I backed out (patch -R).  It's short...
> > 
> > diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
> > index 5a2f2b2..4abc6d2 100644
> > --- a/kernel/sysctl_check.c
> > +++ b/kernel/sysctl_check.c
> > @@ -738,7 +738,7 @@ static struct trans_ctl_table trans_net_table[] = {
> > { NET_ROSE, "rose", trans_net_rose_table },
> > { NET_IPV6, "ipv6", trans_net_ipv6_table },
> > { NET_X25,  "x25",  trans_net_x25_table },
> > -   { NET_TR,   "tr",   trans_net_tr_table },
> > +   { NET_TR,   "token-ring",   trans_net_tr_table },
> > { NET_DECNET,   "decnet",   trans_net_decnet_table },
> > /*  NET_ECONET not used */
> > { NET_SCTP, "sctp", trans_net_sctp_table },
> 
> reverting this makes the kernel image shorter by 8 bytes - so perhaps 
> some alignment issue somewhere? Or something gets overflown? Does any of 
> this get actually used by your bootup?

Dunno...  The dmesg output is not terribly useful here, because most of
the "interesting" stuff concerning udev startup that appears on the
console never makes it into a log.  Note that, for the bad cases, I
don't see the same console output that Michael reported, although the
net effect is the same: the partitions don't get found, so I'm offered
the chance to enter my root password and do some poking around, and
when I do, none of the block devices are present under /dev.

I'm open to suggestions on how to take this analysis further.  Michael
indicated he's running a conference this week, so I don't know when he'll
be able to come up for air.

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy
I wrote:
> "git diff 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 6f37ac793d6ba7b35d338f791974166f67fdd9ba"
> produced a relatively short patch (18,437 bytes).  The list of involved
> files:
> 
> (omitted)
>
> Current state of the source tree is the 6f37ac... version, so I'll start
> backing out the above diffs in related groups and continue until I've got
> a working kernel.  For lack of an obvious target, I'll start with the
> seemingly innocuous change to sysctl_check.c.  I'll report back when I've
> got something.

That was quick :-).  Backing out the sysctl_check.c diff gives me a
working kernel.  Beats the [EMAIL PROTECTED] out of me how/why, though.

Michael Cree: could you try backing out the diff below from your
2.6.24-rc3 tree and see if things are now working for you?

Here's "uname -a", just to confirm (maybe) I'm running on what I say
works:

Linux smirkin 2.6.24-rc2-g6f37ac79-dirty #2 Fri Dec 7 08:03:12 CST 2007 alpha

Here's the diff I backed out (patch -R).  It's short...

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index 5a2f2b2..4abc6d2 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -738,7 +738,7 @@ static struct trans_ctl_table trans_net_table[] = {
{ NET_ROSE, "rose", trans_net_rose_table },
{ NET_IPV6, "ipv6", trans_net_ipv6_table },
{ NET_X25,  "x25",  trans_net_x25_table },
-   { NET_TR,   "tr",   trans_net_tr_table },
+   { NET_TR,   "token-ring",   trans_net_tr_table },
{ NET_DECNET,   "decnet",   trans_net_decnet_table },
/*  NET_ECONET not used */
{ NET_SCTP, "sctp", trans_net_sctp_table },
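
The only thing the diff above changes is one string literal. As a quick sanity check of how much that literal grows, here is a sketch that counts the bytes a C string literal occupies (its characters plus the terminating NUL). `c_literal_size` is a made-up helper. Whether the kernel image changes by exactly this amount depends on how the compiler and linker place and pad the string, which this does not model.

```python
def c_literal_size(s):
    """Bytes occupied by a C string literal: characters plus the NUL byte."""
    return len(s.encode("ascii")) + 1


old_size = c_literal_size("tr")          # 'tr' plus NUL
new_size = c_literal_size("token-ring")  # 'token-ring' plus NUL
growth = new_size - old_size

print(f'"tr": {old_size} bytes, "token-ring": {new_size} bytes, growth: {growth}')
```

The literal grows from 3 to 11 bytes, which is consistent with the 8-byte size difference Ingo Molnar reported in his reply.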

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy
Andrew Morton wrote:
> On Thu, 6 Dec 2007 23:07:08 -0600 (CST) [EMAIL PROTECTED] (Bob Tracy) wrote:
> > Andrew Morton wrote:
> > > commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> > > Merge: 2f1f53b... d90bf5a...
> > > Author: Linus Torvalds <[EMAIL PROTECTED]>
> > > Date:   Wed Nov 14 18:51:48 2007 -0800
> > > 
> > > Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/n
> > > 
> > > * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
> > >   (omitted for brevity)
> > > 
> > > I'm struggling to see how any of those could have broken block device
> > > mounting on alpha.  Are you sure you bisected right?
> > 
> > Based on what's in that commit, it *does* appear something went wrong
> > with bisection.  If the implicated commit is the next one in time
> > sequence relative to
> > 
> > # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better
> > 
> > then the test of whether I bisected correctly is as simple as applying
> > the commit and seeing if things break, because I'm running on the
> > kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
> > now.  Let me give that a try and I'll report back.  Worst case, I'll
> > have to start over and write off the past four days...
> 
> Gad.  I trust the second time will be faster.
> 
> git-bisect _is_ very error prone.  I find one of the problems is that each
> step is so far apart in time that you forget what you were doing.  Did I
> remember to test that iteration?  Did I install the right kernel?  etc.
> 
> > Sorry about this...
> 
> Not appropriate ;)   Thanks for helping out.

Thanks for the kind words...  The above-mentioned test verified that the
bisection was/is correct: 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 works,
and 6f37ac793d6ba7b35d338f791974166f67fdd9ba doesn't.  Now I've got to
figure out why.

"git diff 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 6f37ac793d6ba7b35d338f791974166f67fdd9ba"
produced a relatively short patch (18,437 bytes).  The list of involved
files:

diff --git a/drivers/char/random.c b/drivers/char/random.c
diff --git a/drivers/isdn/sc/card.h b/drivers/isdn/sc/card.h
diff --git a/drivers/isdn/sc/packet.c b/drivers/isdn/sc/packet.c
diff --git a/drivers/isdn/sc/shmem.c b/drivers/isdn/sc/shmem.c
diff --git a/drivers/net/arm/ep93xx_eth.c b/drivers/net/arm/ep93xx_eth.c
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
diff --git a/drivers/net/fs_enet/Kconfig b/drivers/net/fs_enet/Kconfig
diff --git a/drivers/net/fs_enet/Makefile b/drivers/net/fs_enet/Makefile
diff --git a/drivers/net/netx-eth.c b/drivers/net/netx-eth.c
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
diff --git a/include/net/sock.h b/include/net/sock.h
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
diff --git a/net/core/dev.c b/net/core/dev.c
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c

Current state of the source tree is the 6f37ac... version, so I'll start
backing out the above diffs in related groups and continue until I've got
a working kernel.  For lack of an obvious target, I'll start with the
seemingly innocuous change to sysctl_check.c.  I'll report back when I've
got something.

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Bob Tracy
I wrote:
> If the implicated commit is the next one in time
> sequence relative to
> 
> # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better
> 
> then the test of whether I bisected correctly is as simple as applying
> the commit and seeing if things break, because I'm running on the
> kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
> now.  Let me give that a try and I'll report back.

Verified that 6f37ac793d6ba7b35d338f791974166f67fdd9ba is the next
commit after the "good" kernel I'm running now.  The build is running,
and I should have an answer for us in a few hours.

-- 
--------
Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Bob Tracy
Andrew Morton wrote:
> commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> Merge: 2f1f53b... d90bf5a...
> Author: Linus Torvalds <[EMAIL PROTECTED]>
> Date:   Wed Nov 14 18:51:48 2007 -0800
> 
> Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/n
> 
> * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
>   [NET]: rt_check_expire() can take a long time, add a cond_resched()
>   [ISDN] sc: Really, really fix warning
>   [ISDN] sc: Fix sndpkt to have the correct number of arguments
>   [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
>   [NET]: Remove notifier block from chain when register_netdevice_notifier f
>   [FS_ENET]: Fix module build.
>   [TCP]: Make sure write_queue_from does not begin with NULL ptr
>   [TCP]: Fix size calculation in sk_stream_alloc_pskb
>   [S2IO]: Fixed memory leak when MSI-X vector allocation fails
>   [BONDING]: Fix resource use after free
>   [SYSCTL]: Fix warning for token-ring from sysctl checker
>   [NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_S
>   [IWLWIFI]: Not correctly dealing with hotunplug.
>   [TCP] FRTO: Plug potential LOST-bit leak
>   [TCP] FRTO: Limit snd_cwnd if TCP was application limited
>   [E1000]: Fix schedule while atomic when called from mii-tool.
>   [NETX]: Fix build failure added by 2.6.24 statistics cleanup.
>   [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
>   [PKT_SCHED]: Check subqueue status before calling hard_start_xmit
> 
> I'm struggling to see how any of those could have broken block device
> mounting on alpha.  Are you sure you bisected right?

Based on what's in that commit, it *does* appear something went wrong
with bisection.  If the implicated commit is the next one in time
sequence relative to

# good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better

then the test of whether I bisected correctly is as simple as applying
the commit and seeing if things break, because I'm running on the
kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
now.  Let me give that a try and I'll report back.  Worst case, I'll
have to start over and write off the past four days...

Sorry about this...

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Bob Tracy
OK.  Finally have this thing painted into a corner: git has identified
6f37ac793d6ba7b35d338f791974166f67fdd9ba as the first bad commit.

From "git bisect log", this corresponds to

# bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Here's the full log:

git-bisect start
# good: [9aae299f7fd1888ea3a195cfe0edef17bb647415] Linux 2.6.24-rc2
git-bisect good 9aae299f7fd1888ea3a195cfe0edef17bb647415
# bad: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
git-bisect bad f05092637dc0d9a3f2249c9b283b973e6e96b7d2
# good: [e6a5c27f3b0fef72e528fc35e343af4b2db790ff] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
git-bisect good e6a5c27f3b0fef72e528fc35e343af4b2db790ff
# good: [42614fcde7bfdcbe43a7b17035c167dfebc354dd] vmstat: fix section mismatch warning
git-bisect good 42614fcde7bfdcbe43a7b17035c167dfebc354dd
# bad: [a052f4473603765eb6b4c19754689977601dc1d1] Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/x86
git-bisect bad a052f4473603765eb6b4c19754689977601dc1d1
# good: [d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5] CRISv10 improve and bugfix fasttimer
git-bisect good d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5
# good: [d90bf5a976793edfa88d3bb2393f0231eb8ce1e5] [NET]: rt_check_expire() can take a long time, add a cond_resched()
git-bisect good d90bf5a976793edfa88d3bb2393f0231eb8ce1e5
# good: [2a113281f5cd2febbab21a93c8943f8d3eece4d3] kconfig: use $K64BIT to set 64BIT with all*config targets
git-bisect good 2a113281f5cd2febbab21a93c8943f8d3eece4d3
# good: [2e2cd8bad6e03ceea73495ee6d557044213d95de] CRISv10 memset library add lineendings to asm
git-bisect good 2e2cd8bad6e03ceea73495ee6d557044213d95de
# bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
# good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better
git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3
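
Since it is easy to lose track of which kernels were marked good or bad over several days, a log like the one above can be summarized with a few lines of script. The sketch below is only an illustration: `summarize_bisect_log` is a made-up helper, it understands just the "git-bisect good|bad <sha>" lines used in this log, and the sample is a shortened copy of that log.

```python
def summarize_bisect_log(text):
    """Collect the commits marked good and bad, in the order they were marked."""
    marks = {"good": [], "bad": []}
    for line in text.splitlines():
        parts = line.split()
        # Only 'git-bisect good <sha>' / 'git-bisect bad <sha>' lines count;
        # '#' comment lines and 'git-bisect start' are skipped.
        if len(parts) == 3 and parts[0] == "git-bisect" and parts[1] in marks:
            marks[parts[1]].append(parts[2])
    return marks


LOG = """\
git-bisect start
# good: [9aae299f7fd1888ea3a195cfe0edef17bb647415] Linux 2.6.24-rc2
git-bisect good 9aae299f7fd1888ea3a195cfe0edef17bb647415
# bad: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
git-bisect bad f05092637dc0d9a3f2249c9b283b973e6e96b7d2
git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3
"""

marks = summarize_bisect_log(LOG)
print("most recent bad mark: ", marks["bad"][-1])
print("most recent good mark:", marks["good"][-1])
```

In this log the most recent bad mark happens to be the commit git reported as the first bad one; that need not hold for every bisection.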

-- 
--------
Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-05 Thread Bob Tracy
Current progress: 11 revisions left to test.  The current partial
"git bisect log" is available per Ingo's suggestion on bugzilla.

http://bugzilla.kernel.org/show_bug.cgi?id=9457

-- 
--------
Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-04 Thread Bob Tracy
Ingo Molnar wrote:
> once you are done with the download of the initial cloned git repository 
> (which is 200MB+), all the bisection steps will be local and you'll be 
> only limited by kernel rebuild speed and by bootup and testing speed, 
> not by network bandwidth.

ACK.  Have tested two kernels in the past 24 hours, and the third is
building as I type this.  The builds seem to be taking about 3 hours
each.  First two tests good, so the offending commit is somewhere in
the last 25% (roughly) of the changes between -rc2 and -rc3: git says
82 revisions left to test.  Might have this painted into a corner in
the next day or so.  I'll try to be quick about it, since -rc4 is out.
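
As a back-of-the-envelope check on that schedule: if every bisection step roughly halves the candidates, the number of kernels still to build is about log2 of the revisions left. The sketch below assumes exactly that, plus the 3 hours per build mentioned above; `estimated_steps` is a made-up helper, and git's own step estimate can differ slightly because merge history is not a straight line.

```python
import math

HOURS_PER_BUILD = 3  # "about 3 hours each", per the message above


def estimated_steps(revisions_left):
    """Rough number of build-and-test rounds if each one halves the range."""
    if revisions_left < 1:
        return 0
    return max(1, math.ceil(math.log2(revisions_left)))


steps = estimated_steps(82)
print(f"82 revisions left: about {steps} more builds, "
      f"roughly {steps * HOURS_PER_BUILD} hours of build time")
```

That counts build time only; rebooting, testing and sleeping are extra.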

> ( once you have the cloned repository i'd suggest for you to keep it - 
>   that way you can track susequent kernels via "git-pull" and it uses a 
>   very network-efficient delta protocol. )

Will do...  I'm in the fortunate position of having enough disk space
on my Alpha that I can maintain multiple trees for this kind of effort.

-- 
--------
Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-02 Thread Bob Tracy
Michael Cree wrote:
> On 1/12/2007, at 11:42 AM, Andrew Morton wrote:
> > On Sat, 01 Dec 2007 11:30:01 +1300
> > Michael Cree <[EMAIL PROTECTED]> wrote:
> >
> >> Bob Tracy wrote:
> >>>  Here's
> >>> hoping someone else is seeing this or can replicate it in the  
> >>> meantime.
> >>
> >> Snap.
> >>
> >> 2.6.24-rc2 works fine.   2.6.24-rc3 boots on Alpha but once /dev is
> >> populated no partitions of the scsi sub-system are seen.  Looks  
> >> like ide sub-system similarly affected.
> 
> [snip]
> 
> >> eth0: Digital DS21142/43 Tulip rev 65 at Port 0x29400,
> >> 08:00:2b:87:4c:b0, IRQ 45.
> >> Linux video capture interface: v2.00
> >> scsi_id[402]: scsi_id: unable to access '/block'
> >
> > I guess this is where things go bad.
> 
> Yes, that is what I thought too.

Thanks for the confirmation of the error condition.  As best I can
recall, your boot log is substantially the same as what I saw.

Finally got back in town.  Starting the git-bisect process.  I've got
a relatively slow network connection, and the PWS 433au isn't exactly
what I would call "fast" by modern standards, so bear with me while I
get things set up and crank through this.  The clone of the 2.6 tree
will take several more hours to finish downloading.  I anticipate the
best pace I'll be able to manage after that is two iterations in a
24-hour period.

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [PATCH] aic7xxx/aicasm build failure w/gcc-3.4.6

2007-05-22 Thread Bob Tracy
James Bottomley wrote:
> We really don't want gcc making assumptions about prototypes ... even if
> it's getting them right in all likelihood (doubtless unprototyped
> assumed functions will become a warning and then an error in later gcc
> versions ...), so this is a better fix

ACK.  The fix works here.  If you would be so kind, please push it
upstream at your convenience.

gcc-4.X violates the principle of least astonishment over even more
nitnoid matters, but that's another flame for another day.

-- 
-------
Bob Tracy   | "Eagles may soar, but weasels don't get
[EMAIL PROTECTED]|  sucked into jet engines."   --Anon
---


[PATCH] aic7xxx/aicasm build failure w/gcc-3.4.6

2007-05-22 Thread Bob Tracy
Second try: originally reported this back on April 17th.  2.6.X
kernel builds started failing after I upgraded my compiler from
gcc-3.3.X to gcc-3.4.6:

make -C drivers/scsi/aic7xxx/aicasm
(...)
gcc -I/usr/include -I. aicasm.c aicasm_symbol.c aicasm_gram.c aicasm_macro_gram.c aicasm_scan.c aicasm_macro_scan.c -o aicasm -ldb
aicasm_gram.y:1948: error: conflicting types for 'yyerror'
aicasm_gram.tab.c:3004: error: previous implicit declaration of 'yyerror' was here
aicasm_macro_gram.y:162: error: conflicting types for 'mmerror'
aicasm_macro_gram.tab.c:1196: error: previous implicit declaration of 'mmerror' was here

As a workaround, commenting out or deleting the "void" declarations
for yyerror() and mmerror() in the respective ".y" files fixes the
problem.  A patch to illustrate the offending code is attached, but
there's no "signed-off by" line because I'm certain the final form of
the patch will be different.  The patch applies cleanly to at least
2.6.21 and later kernels.  gcc-3.3 may have been warning about the
type conflicts, but I didn't notice: gcc-3.4 treats the type conflicts
as errors, so I *did* notice :-).
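
To illustrate the failure mode (a minimal sketch, not the aicasm
source): in C89, calling a function before any declaration makes the
compiler assume it returns int, and a later definition returning void
then conflicts with that assumption.  Declaring the prototype ahead of
the first call avoids the conflict without dropping the "void".  The
names report_error() and parse() are hypothetical stand-ins for
yyerror() and the generated parser.

```c
#include <stdio.h>

/*
 * Prototype placed ahead of the first call.  Without this line, a C89
 * compiler implicitly declares report_error() as returning int at the
 * call site in parse(), and the "void" definition further down then
 * has a conflicting type.
 */
void report_error(const char *string);

/* Hypothetical stand-in for the generated parser calling its error hook. */
int parse(int token)
{
	if (token < 0) {
		report_error("unexpected token");
		return 1;
	}
	return 0;
}

/* Hypothetical stand-in for yyerror(): defined after its first use. */
void report_error(const char *string)
{
	fprintf(stderr, "error: %s\n", string);
}
```

Removing the prototype line and building with a compiler in C89 mode
should reproduce a diagnostic of the same kind as the one quoted above.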

Here's the "gcc -v" output:

Reading specs from /usr/lib/gcc/i486-slackware-linux/3.4.6/specs
Configured with: ../gcc-3.4.6/configure --prefix=/usr --enable-shared --enable-threads=posix --enable-__cxa_atexit --disable-checking --with-gnu-ld --verbose --target=i486-slackware-linux --host=i486-slackware-linux
Thread model: posix
gcc version 3.4.6

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---
--- linux/drivers/scsi/aic7xxx/aicasm/aicasm_gram.y.orig	2006-02-06 06:00:58.0 -0600
+++ linux/drivers/scsi/aic7xxx/aicasm/aicasm_gram.y	2007-04-16 12:31:08.0 -0500
@@ -1943,7 +1943,7 @@
 	versions[newlen + oldlen + 1] = '\0';
 }
 
-void
+/* void */
 yyerror(const char *string)
 {
 	stop(string, EX_DATAERR);
--- linux/drivers/scsi/aic7xxx/aicasm/aicasm_macro_gram.y.orig	2002-12-24 07:09:30.0 -0600
+++ linux/drivers/scsi/aic7xxx/aicasm/aicasm_macro_gram.y	2007-04-16 12:32:48.0 -0500
@@ -157,7 +157,7 @@
 	}
 }
 
-void
+/* void */
 mmerror(const char *string)
 {
 	stop(string, EX_DATAERR);


Re: BAD_SG_DMA panic in aha1542

2007-05-01 Thread Bob Tracy
Alan Cox wrote:
> The one I sent has a memory leak but it won't matter for basic testing.
> Or you can change the final bit to
> 
> 
> scsi_normalize_sense((char *)sense, sizeof(*sense), &sshdr);
> 
> if (zebedee != cgc->buffer) {
> if (cgc->data_direction == DMA_FROM_DEVICE)
> memcpy(cgc->buffer, zebedee, cgc->buflen);
> kfree(zebedee); /* Time for bed */
> }

I changed it, because I'll be living with this for a while I'd bet...

Works fine.  No more BAD_SG_DMA() calls.  Thanks!
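
For readers following along, the bounce-buffer pattern in the quoted
snippet can be sketched in userspace C as below.  This is an
illustration under stated assumptions, not the sr driver code:
fake_device_read(), bounce_read(), and the enum are hypothetical, and
malloc()/free() stand in for the kernel allocation and kfree().  The
length is assumed to be non-zero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical transfer directions, loosely mirroring DMA_TO_DEVICE / DMA_FROM_DEVICE. */
enum xfer_dir { XFER_TO_DEVICE, XFER_FROM_DEVICE };

/* Hypothetical device: fills the buffer it is handed with a fixed pattern. */
void fake_device_read(unsigned char *buf, size_t len)
{
	memset(buf, 0xA5, len);
}

/*
 * Run a transfer through a temporary "bounce" buffer instead of the
 * caller's buffer.  In a driver the bounce buffer would come from memory
 * the adapter can reach; here malloc() stands in for that allocation.
 * Returns 0 on success, -1 if the bounce buffer could not be allocated.
 */
int bounce_read(unsigned char *caller_buf, size_t len, enum xfer_dir dir)
{
	unsigned char *bounce = malloc(len);

	if (bounce == NULL)
		return -1;

	if (dir == XFER_TO_DEVICE) {
		/* Stage outgoing data where the device can see it. */
		memcpy(bounce, caller_buf, len);
	} else {
		/* Device writes into the bounce buffer... */
		fake_device_read(bounce, len);
		/* ...and the result is copied back to the caller. */
		memcpy(caller_buf, bounce, len);
	}

	free(bounce);   /* always release the bounce buffer, so nothing leaks */
	return 0;
}
```

The copy-back happens only for device-to-memory transfers, and the
bounce buffer is freed on every successful path, which is the part the
quoted change adds.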

-- 
-------
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: BAD_SG_DMA panic in aha1542

2007-05-01 Thread Bob Tracy
 drivers/scsi/aha1542.c:78/BAD_SG_DMA()
 [] aha1542_queuecommand+0x4a4/0x4ce [aha1542]
 [] scsi_done+0x0/0x16 [scsi_mod]
 [] scsi_dispatch_cmd+0x1b0/0x223 [scsi_mod]
 [] scsi_request_fn+0x22e/0x2ac [scsi_mod]
 [] blk_run_queue+0x2a/0x4b
 [] scsi_queue_insert+0x75/0x7d [scsi_mod]
 [] blk_done_softirq+0x4a/0x55
 [] __do_softirq+0x35/0x75
 [] do_softirq+0x22/0x26
 [] do_IRQ+0x48/0x50
 [] common_interrupt+0x1a/0x20
 [] scsi_dispatch_cmd+0x1b4/0x223 [scsi_mod]
 [] scsi_request_fn+0x22e/0x2ac [scsi_mod]
 [] __generic_unplug_device+0x1d/0x1f
 [] blk_execute_rq_nowait+0x64/0x6a
 [] blk_execute_rq+0x6e/0x8f
 [] blk_end_sync_rq+0x0/0x1d
 [] mempool_alloc+0x1c/0x97
 [] bio_phys_segments+0xe/0x14
 [] blk_rq_bio_prep+0x28/0x7c
 [] scsi_execute+0xc6/0xd9 [scsi_mod]
 [] sr_do_ioctl+0x80/0x1bd [sr_mod]
 [] scsi_set_medium_removal+0x43/0x67 [scsi_mod]
 [] sr_packet+0x1a/0x1f [sr_mod]
 [] cdrom_open+0x337/0x8ae [cdrom]
 [] wait_for_completion+0x5b/0x84
 [] default_wake_function+0x0/0xc
 [] call_usermodehelper_keys+0xa6/0xb2
 [] __call_usermodehelper+0x0/0x43
 [] request_module+0xc2/0xd0
 [] kobject_get+0xf/0x13
 [] sr_block_open+0x74/0x81 [sr_mod]
 [] do_open+0x8a/0x313
 [] scsi_request_fn+0x273/0x2ac [scsi_mod]
 [] io_schedule+0xe/0x16
 [] __wait_on_bit+0x50/0x58
 [] sync_buffer+0x0/0x2e
 [] sync_buffer+0x0/0x2e
 [] out_of_line_wait_on_bit+0x62/0x6a
 [] wake_bit_function+0x0/0x3c
 [] __wait_on_buffer+0x1c/0x1f
 [] __ext3_get_inode_loc+0x263/0x2b1 [ext3]
 [] d_splice_alias+0xa9/0xc3
 [] ext3_lookup+0x98/0xb8 [ext3]
 [] do_lookup+0x4f/0x135
 [] dput+0x1a/0x10b
 [] __link_path_walk+0xa5d/0xba8
 [] blkdev_get+0x55/0x60
 [] open_bdev_excl+0x32/0x6e
 [] get_sb_bdev+0x14/0x115
 [] isofs_get_sb+0x12/0x16 [isofs]
 [] isofs_fill_super+0x0/0x899 [isofs]
 [] vfs_kern_mount+0x88/0xfd
 [] do_kern_mount+0x26/0x36
 [] do_mount+0x589/0x5fb
 [] mntput_no_expire+0x11/0x59
 [] mntput_no_expire+0x11/0x59
 [] link_path_walk+0xaf/0xb9
 [] __handle_mm_fault+0x341/0x620
 [] do_path_lookup+0x195/0x1b5
 [] __handle_mm_fault+0x187/0x620
 [] get_page_from_freelist+0x6e/0x2bb
 [] __get_free_pages+0x25/0x3e
 [] copy_mount_options+0x27/0x10a
 [] sys_mount+0x6a/0xa2
 [] syscall_call+0x7/0xb
sgpnt[0:1] page c3489af0/0x3489af0 length 32
BUG: warning at drivers/scsi/aha1542.c:78/BAD_SG_DMA()
 [] aha1542_queuecommand+0x4a4/0x4ce [aha1542]
 [] scsi_done+0x0/0x16 [scsi_mod]
 [] scsi_dispatch_cmd+0x1b0/0x223 [scsi_mod]
 [] scsi_request_fn+0x22e/0x2ac [scsi_mod]
 [] blk_run_queue+0x2a/0x4b
 [] scsi_queue_insert+0x75/0x7d [scsi_mod]
 [] blk_done_softirq+0x4a/0x55
 [] __do_softirq+0x35/0x75
 [] do_softirq+0x22/0x26
 [] do_IRQ+0x48/0x50
 [] common_interrupt+0x1a/0x20
 [] scsi_dispatch_cmd+0x1b4/0x223 [scsi_mod]
 [] scsi_request_fn+0x22e/0x2ac [scsi_mod]
 [] __generic_unplug_device+0x1d/0x1f
 [] blk_execute_rq_nowait+0x64/0x6a
 [] blk_execute_rq+0x6e/0x8f
 [] blk_end_sync_rq+0x0/0x1d
 [] mempool_alloc+0x1c/0x97
 [] bio_phys_segments+0xe/0x14
 [] blk_rq_bio_prep+0x28/0x7c
 [] scsi_execute+0xc6/0xd9 [scsi_mod]
 [] sr_do_ioctl+0x80/0x1bd [sr_mod]
 [] scsi_set_medium_removal+0x43/0x67 [scsi_mod]
 [] sr_packet+0x1a/0x1f [sr_mod]
 [] cdrom_open+0x337/0x8ae [cdrom]
 [] wait_for_completion+0x5b/0x84
 [] default_wake_function+0x0/0xc
 [] call_usermodehelper_keys+0xa6/0xb2
 [] __call_usermodehelper+0x0/0x43
 [] request_module+0xc2/0xd0
 [] kobject_get+0xf/0x13
 [] sr_block_open+0x74/0x81 [sr_mod]
 [] do_open+0x8a/0x313
 [] scsi_request_fn+0x273/0x2ac [scsi_mod]
 [] io_schedule+0xe/0x16
 [] __wait_on_bit+0x50/0x58
 [] sync_buffer+0x0/0x2e
 [] sync_buffer+0x0/0x2e
 [] out_of_line_wait_on_bit+0x62/0x6a
 [] wake_bit_function+0x0/0x3c
 [] __wait_on_buffer+0x1c/0x1f
 [] __ext3_get_inode_loc+0x263/0x2b1 [ext3]
 [] d_splice_alias+0xa9/0xc3
 [] ext3_lookup+0x98/0xb8 [ext3]
 [] do_lookup+0x4f/0x135
 [] dput+0x1a/0x10b
 [] __link_path_walk+0xa5d/0xba8
 [] blkdev_get+0x55/0x60
 [] open_bdev_excl+0x32/0x6e
 [] get_sb_bdev+0x14/0x115
 [] isofs_get_sb+0x12/0x16 [isofs]
 [] isofs_fill_super+0x0/0x899 [isofs]
 [] vfs_kern_mount+0x88/0xfd
 [] do_kern_mount+0x26/0x36
 [] do_mount+0x589/0x5fb
 [] mntput_no_expire+0x11/0x59
 [] mntput_no_expire+0x11/0x59
 [] link_path_walk+0xaf/0xb9
 [] __handle_mm_fault+0x341/0x620
 [] do_path_lookup+0x195/0x1b5
 [] __handle_mm_fault+0x187/0x620
 [] get_page_from_freelist+0x6e/0x2bb
 [] __get_free_pages+0x25/0x3e
 [] copy_mount_options+0x27/0x10a
 [] sys_mount+0x6a/0xa2
 [] syscall_call+0x7/0xb
ISO 9660 Extensions: Microsoft Joliet Level 1
ISOFS: changing to secondary root

-- 
-----------
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: BAD_SG_DMA panic in aha1542

2007-04-30 Thread Bob Tracy
rct wrote:
> Apologies to all concerned for an unfortunate delay in resolving this.
> (...)
> I'll go retrieve a more conservatively-configured source tree (closer to
> what DSL-N uses) and start over...

Success with the Debian 2.6.18-4-486 build, which is known to work
almost as well on the test platform as the 2.6.12 kernel that DSL-N
comes with.  I used an older compiler than Debian used for their
production build, so I binary-patched the aha1542.ko and sr_mod.ko
files so insmod wouldn't complain about different vermagic strings.

That's as close as it gets without redoing everything from scratch.
I'll give Alan's and James' patches a go within the next 13 hours.

(Alan: what *else* would you name a variable associated with a bounce
buffer besides Zebedee?  Thanks for the occasion to smile...)

-- 
-------
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: BAD_SG_DMA panic in aha1542

2007-04-30 Thread Bob Tracy
Apologies to all concerned for an unfortunate delay in resolving this.
I chose "unwisely" when I picked a popular experimental distro's
2.6.20 kernel source as a base for my troubleshooting efforts.  The
resulting kernel panics when it tries to load the initial ramdisk, and
I don't have the patience to track down which of the turned-on-by-default
experimental configuration parameters might be causing the problem.

I'll go retrieve a more conservatively-configured source tree (closer to
what DSL-N uses) and start over...

-- 
-----------
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: BAD_SG_DMA panic in aha1542

2007-04-27 Thread Bob Tracy
Alan Cox wrote:
> > As before, no problems using the sda hard disk (which is the boot drive):
> > everything works reliably until I touch the cdrom drive.
> 
> A little quiet contemplation and gnome number 387 suggests trying the
> following (and providing more detailed information such as the last
> message printed before the DMA message). Stuff a BUG() before the panic
> in BAD_DMA (aha1542.c) if needed to get a good trace.
> 
> Please report success/failure/change.

Can do.  I don't have access to the machine on weekends, so it will be
at least Monday before I can give this a whirl.  Thanks!

-- 
-------
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: BAD_SG_DMA panic in aha1542

2007-04-27 Thread Bob Tracy
James Bottomley wrote:
> On Fri, 2007-04-27 at 16:47 -0500, Bob Tracy wrote:
> > I previously reported an ISA DMA issue for the 2.6.12 kernel.  The issue
> > persists through at least 2.6.18.  SCSI controller is an Adaptec
> > AHA-1542B (ISA).
> > 
> > The action "mount -t iso9660 /dev/scd0 /mnt/cdrom -r"
> > 
> > produces
> > 
> > (cdrom detection messages as various modules autoload, then...)
> 
> Knowing what these messages are would be helpful; it tells me what
> point in the initialisation it got to.

Sorry about that...  I'm running the DSL-N distribution (based on
Knoppix), and having to transcribe the log messages by hand from the
console, i.e., there's no logfile to cut-and-paste from :-(.  I don't
have access to the machine except on weekdays, but I'll repeat the
crash first thing Monday morning and copy everything that's there...

> I'm interested.
> 
> This is clearly a use_sg==1 path that has failed to bounce the buffer
> for some reason ... and I was contemplating eliminating the GFP_DMA from
> our sr driver because I thought the block bouncing had it covered.
> 
> It might also be helpful to apply this patch.  It should give a stack
> trace of the problem command and not immediately panic the box.

I'll throw together a 2.6.21 kernel with this patch and give it a try.
Again, it will be at least Monday before you hear back from me on this.

Thanks!

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


BAD_SG_DMA panic in aha1542

2007-04-27 Thread Bob Tracy
I previously reported an ISA DMA issue for the 2.6.12 kernel.  The issue
persists through at least 2.6.18.  SCSI controller is an Adaptec
AHA-1542B (ISA).

The action "mount -t iso9660 /dev/scd0 /mnt/cdrom -r"

produces

(cdrom detection messages as various modules autoload, then...)
sgpnt[0:1] page c1ee5af0/0x1ee5af0 length 32
Kernel panic - not syncing: Buffer at physical address > 16 Mb used for aha1542

As before, no problems using the sda hard disk (which is the boot drive):
everything works reliably until I touch the cdrom drive.
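
As a rough illustration of why the logged buffer trips the check (a
sketch, not the aha1542.c macro itself): ISA bus-master DMA can only
address the first 16 MB of physical memory, so a buffer that ends above
that limit is unusable unless it is bounced.  ISA_DMA_LIMIT and
isa_dma_reachable() are names chosen here for illustration; the address
and length in the note below come from the sgpnt line above.

```c
#include <stdint.h>

/* 24 address lines on the ISA bus: 2^24 bytes = 16 MB. */
#define ISA_DMA_LIMIT 0x1000000UL

/*
 * Return 1 if the whole buffer [phys, phys + len) lies below the 16 MB
 * ISA DMA limit, 0 otherwise.  Hypothetical helper for illustration.
 */
int isa_dma_reachable(uint64_t phys, uint64_t len)
{
	return phys + len <= ISA_DMA_LIMIT;
}
```

With the logged values, isa_dma_reachable(0x1ee5af0, 32) returns 0: the
buffer sits at just under 31 MB, well above the 16 MB limit.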

I'll be happy to assist with the debugging, but the system with the
aha1542 has no development facilities, i.e., I'll have to build test
kernels on a different system, and turnaround is going to be slow :-(.

Thanks in advance for helping me get this old machine working again.
No issues with 2.4 kernels.  I have no idea about 2.5 kernels and
2.6 kernels prior to 2.6.12.  As for why I didn't report this before
now, the aha1542b was in my parts bin until I cobbled a system together
approx. two weeks ago, mostly to see if a useful system could still be
had using legacy hardware and modern GNU/Linux software.  I'm happy to
report the answer is mostly "yes".

-- 
-------
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


old ISA DMA bug in 2.6.12?

2007-04-24 Thread Bob Tracy
I was enjoying yet another session of beating my head against the wall
trying to do useful things with old hardware :-), and managed to cause a
kernel panic by simply trying to mount a cdrom in the context of a DSL-N
installation.

The SCSI host adapter is an Adaptec AHA-1542B, and when I try to mount a
cdrom, I manage to run afoul of the BAD_DMA() check in aha1542.c: the
buffer returned is not in the lower 16 MB of memory.

The same 2.6.12 kernel + hardware combination works fine as long as I
confine my I/O to the hard disk that's also attached to the AHA-1542B.

-- 
-------
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


[BUG] aic7xxx/aicasm build failure w/gcc-3.4.6

2007-04-17 Thread Bob Tracy
(Sent to linux-kernel yesterday.  Forgot to "Cc:" here.)

This showed up during a 2.6.21-rc7 build after I upgraded gcc from 3.3
to 3.4 on a Slackware system:

make -C drivers/scsi/aic7xxx/aicasm
(...)
gcc -I/usr/include -I. aicasm.c aicasm_symbol.c aicasm_gram.c aicasm_macro_gram.c aicasm_scan.c aicasm_macro_scan.c -o aicasm -ldb
aicasm_gram.y:1948: error: conflicting types for 'yyerror'
aicasm_gram.tab.c:3004: error: previous implicit declaration of 'yyerror' was here
aicasm_macro_gram.y:162: error: conflicting types for 'mmerror'
aicasm_macro_gram.tab.c:1196: error: previous implicit declaration of 'mmerror' was here

As a workaround, deleting the "void" declarations for yyerror() and
mmerror() in the respective ".y" files fixes the problem.  gcc-3.3
may have been warning about the type conflicts, but I didn't notice:
gcc-3.4 treats them as errors, so I *did* notice :-).

Here's the "gcc -v" output:

Reading specs from /usr/lib/gcc/i486-slackware-linux/3.4.6/specs
Configured with: ../gcc-3.4.6/configure --prefix=/usr --enable-shared --enable-threads=posix --enable-__cxa_atexit --disable-checking --with-gnu-ld --verbose --target=i486-slackware-linux --host=i486-slackware-linux
Thread model: posix
gcc version 3.4.6

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---