Re: Cas driver fails to load first time after boot.

2013-01-28 Thread Paul Keusemann


On 01/25/13 17:34, Marius Strobl wrote:
> On Fri, Jan 25, 2013 at 01:14:51PM -0600, Paul Keusemann wrote:
>> On 01/25/13 10:19, Marius Strobl wrote:
>>> On Thu, Jan 24, 2013 at 08:48:04PM -0600, Paul Keusemann wrote:
>>>> On 01/24/13 15:50, Marius Strobl wrote:
>>>>> On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:
>>>>>> On 01/24/13 09:09, Marius Strobl wrote:
>>>>>>> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
>>>>>>>> QGE (501-6738-10).  The cas driver fails to load the first time I try to
>>>>>>>> load it but succeeds the second time.  Is this a problem with the card,
>>>>>>>> the driver, my karma?
>>>>>>> Wrong phase of the moon, apparently :)
>>>>>>> The MII setup of these chips is a bit tricky and I'm not sure whether
>>>>>>> I've hit all code paths during development of the driver. I certainly
>>>>>>> didn't test with a 501-6738, these have been reported as working before,
>>>>>>> though. It also doesn't make much sense that attaching the devices
>>>>>>> succeeds on the second attempt. Could you please use a if_cas.ko built
>>>>>>> with the attached patch and report the debug output for one of the
>>>>>>> interfaces in both the working and the non-working case?
>>>>>> I would love to give you output from the working and non-working case
>>>>>> but apparently the phase of the moon has changed, I can't get it to fail
>>>>>> now.  The messages output from the working case is attached.
>>>>>
>>>>> Thanks but unfortunately this doesn't make any sense either. In general,
>>>>> printf()s cause delays which can be relevant. In the locations I've put
>>>>> them they hardly can make such a difference though.
>>>>> If you haven't already done so, could you please power off the machine
>>>>> before doing the test with the patched module? Is the problem still gone
>>>>> if you revert to the original module?
>>>> OK, power-cycling makes a difference.  The driver fails to attach all of
>>>> the devices after power-cycling most of the time if not all of the
>>>> time.  The number of devices attached varies, the attached message file
>>>> fragment is from my last test.  Three of the devices were attached on
>>>> the first load attempt and all four of them on the second attempt.
>>> Okay, so we now at least have a way to reproduce the problem.
>>> Unfortunately, it's still unclear what's the exact cause of it. At
>>> least the problem is not what I suspected and hoped it most likely is.
>>> Could you please test how things behave after a power-cycle with the
>>> attached patch (after reverting the previous one).
>> The patched driver fails to compile with the following error message:
>>
>> ...
>>
>> I found the following definition of nitems in the iwn and usb/wlan drivers:
>>
>> #define nitems(_a)  (sizeof((_a)) / sizeof((_a)[0]))
>>
>> so I added it to if_cas.c and rebuilt without errors.
>
> Sorry, I didn't think of 8.3 not having nitems(), yet. Actually,
> this part of the patch is orthogonal to your problem and just a
> change I had in that tree.
>
>> This looks like it fixed the problem.  I ran three tests from
>> power-up to loading the driver and the driver loaded successfully all
>> three times.  I then added if_cas_load=YES to /boot/loader.conf and
>> did two more successful reboots from power-up.
>
> Great! Thanks a lot for testing!
>
>> Will this driver work on FreeBSD 9.1?
>
> Yes, the patch should also solve the problem in 9.1. I suspect the
> hang you are seeing there isn't specific to cas(4) but rather a
> general regression that came in with the VIMAGE changes. Now, if
> a network device driver fails to attach during boot and tries to
> clean up by detaching and freeing the interface part at that stage
> again this causes problems. I already talked to bz@ about this and,
> from what I remember of his reply, this is an ordering issue that is at
> least very hard to fix.

OK.  I've successfully upgraded from 8.3-Release to 9.1-Release.  I
stupidly powered-down the machine after the upgrade, so I had to remove
the QGE card to get it to boot 9.1 and build a custom kernel.  The patch
applied cleanly, the kernel built without errors and boots from power-up
without problems.  I've attached the most recent messages file, dmesg,
kldstat and ifconfig output if you're interested.  The only odd thing I
noticed was that cas0 and cas3 log the message "cannot disable RX MAC"
but cas1 and cas2 do not.  I haven't actually tried any of the
interfaces yet but I assume they'll work as expected.

Let me know if there's any further testing you'd like me to do.

Thanks so much for your help with this, it is much appreciated.

Paul

> Marius




--
Paul Keusemann    pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378

Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.1-RELEASE #4: Mon Jan 28 09:02:45 CST 2013
toor@lucid:/usr/obj/usr/src/sys/LUCID amd64
CPU: Intel(R) 

Re: Cas driver fails to load first time after boot.

2013-01-25 Thread Marius Strobl
On Thu, Jan 24, 2013 at 08:48:04PM -0600, Paul Keusemann wrote:
>
> On 01/24/13 15:50, Marius Strobl wrote:
>> On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:
>>> On 01/24/13 09:09, Marius Strobl wrote:
>>>> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
>>>>> Hi,
>>>>>
>>>>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
>>>>> QGE (501-6738-10).  The cas driver fails to load the first time I try to
>>>>> load it but succeeds the second time.  Is this a problem with the card,
>>>>> the driver, my karma?
>>>> Wrong phase of the moon, apparently :)
>>>> The MII setup of these chips is a bit tricky and I'm not sure whether
>>>> I've hit all code paths during development of the driver. I certainly
>>>> didn't test with a 501-6738, these have been reported as working before,
>>>> though. It also doesn't make much sense that attaching the devices
>>>> succeeds on the second attempt. Could you please use a if_cas.ko built
>>>> with the attached patch and report the debug output for one of the
>>>> interfaces in both the working and the non-working case?
>>> I would love to give you output from the working and non-working case
>>> but apparently the phase of the moon has changed, I can't get it to fail
>>> now.  The messages output from the working case is attached.
>>
>> Thanks but unfortunately this doesn't make any sense either. In general,
>> printf()s cause delays which can be relevant. In the locations I've put
>> them they hardly can make such a difference though.
>> If you haven't already done so, could you please power off the machine
>> before doing the test with the patched module? Is the problem still gone
>> if you revert to the original module?
>
> OK, power-cycling makes a difference.  The driver fails to attach all of
> the devices after power-cycling most of the time if not all of the
> time.  The number of devices attached varies, the attached message file
> fragment is from my last test.  Three of the devices were attached on
> the first load attempt and all four of them on the second attempt.

Okay, so we now at least have a way to reproduce the problem.
Unfortunately, it's still unclear what's the exact cause of it. At
least the problem is not what I suspected and hoped it most likely is.
Could you please test how things behave after a power-cycle with the
attached patch (after reverting the previous one).

Marius

Index: if_cas.c
===================================================================
--- if_cas.c	(revision 245046)
+++ if_cas.c	(working copy)
@@ -214,8 +214,12 @@ cas_attach(struct cas_softc *sc)
 		error = ENXIO;
 		goto fail_ifnet;
 	}
-	taskqueue_start_threads(&sc->sc_tq, 1, PI_NET, "%s taskq",
+	error = taskqueue_start_threads(&sc->sc_tq, 1, PI_NET, "%s taskq",
 	    device_get_nameunit(sc->sc_dev));
+	if (error != 0) {
+		device_printf(sc->sc_dev, "could not start threads\n");
+		goto fail_taskq;
+	}
 
 	/* Make sure the chip is stopped. */
 	cas_reset(sc);
@@ -339,10 +343,13 @@ cas_attach(struct cas_softc *sc)
 			    BUS_SPACE_BARRIER_READ | BUS_SPACE_BARRIER_WRITE);
 			/* Enable/unfreeze the GMII pins of Saturn. */
 			if (sc->sc_variant == CAS_SATURN) {
-				CAS_WRITE_4(sc, CAS_SATURN_PCFG, 0);
+				CAS_WRITE_4(sc, CAS_SATURN_PCFG,
+				    CAS_READ_4(sc, CAS_SATURN_PCFG) &
+				    ~CAS_SATURN_PCFG_FSI);
 				CAS_BARRIER(sc, CAS_SATURN_PCFG, 4,
 				    BUS_SPACE_BARRIER_READ |
 				    BUS_SPACE_BARRIER_WRITE);
+				DELAY(1);
 			}
 			error = mii_attach(sc->sc_dev, &sc->sc_miibus, ifp,
 			    cas_mediachange, cas_mediastatus, BMSR_DEFCAPMASK,
@@ -359,10 +366,12 @@ cas_attach(struct cas_softc *sc)
 			/* Freeze the GMII pins of Saturn for saving power. */
 			if (sc->sc_variant == CAS_SATURN) {
 				CAS_WRITE_4(sc, CAS_SATURN_PCFG,
+				    CAS_READ_4(sc, CAS_SATURN_PCFG) |
 				    CAS_SATURN_PCFG_FSI);
 				CAS_BARRIER(sc, CAS_SATURN_PCFG, 4,
 				    BUS_SPACE_BARRIER_READ |
 				    BUS_SPACE_BARRIER_WRITE);
+				DELAY(1);
 			}
 			error = mii_attach(sc->sc_dev, &sc->sc_miibus, ifp,
 			    cas_mediachange, cas_mediastatus, BMSR_DEFCAPMASK,
@@ -2865,7 +2874,7 @@ cas_pci_attach(device_t dev)
 		goto fail;
 	}
 	i = 0;
-	if (lma > 1 && pci_get_slot(dev) < sizeof(enaddr) / sizeof(*enaddr))
+	if (lma > 1 && pci_get_slot(dev) < nitems(enaddr))
 		i = pci_get_slot(dev);
 	memcpy(sc->sc_enaddr, enaddr[i], ETHER_ADDR_LEN);
 
@@ -2874,7 +2883,7 @@ cas_pci_attach(device_t dev)
 		goto fail;
 	}
 	i = 0;
-	if (phy > 1 && pci_get_slot(dev) < sizeof(pcs) / sizeof(*pcs))
+	if (phy > 1 && pci_get_slot(dev) < nitems(pcs))
 		i = pci_get_slot(dev);
 	if (pcs[i] != 0)
 		sc->sc_flags |= CAS_SERDES;
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org

Re: Cas driver fails to load first time after boot.

2013-01-25 Thread Marius Strobl
On Fri, Jan 25, 2013 at 01:14:51PM -0600, Paul Keusemann wrote:
>
> On 01/25/13 10:19, Marius Strobl wrote:
>> On Thu, Jan 24, 2013 at 08:48:04PM -0600, Paul Keusemann wrote:
>>> On 01/24/13 15:50, Marius Strobl wrote:
>>>> On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:
>>>>> On 01/24/13 09:09, Marius Strobl wrote:
>>>>>> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
>>>>>>> QGE (501-6738-10).  The cas driver fails to load the first time I try to
>>>>>>> load it but succeeds the second time.  Is this a problem with the card,
>>>>>>> the driver, my karma?
>>>>>> Wrong phase of the moon, apparently :)
>>>>>> The MII setup of these chips is a bit tricky and I'm not sure whether
>>>>>> I've hit all code paths during development of the driver. I certainly
>>>>>> didn't test with a 501-6738, these have been reported as working before,
>>>>>> though. It also doesn't make much sense that attaching the devices
>>>>>> succeeds on the second attempt. Could you please use a if_cas.ko built
>>>>>> with the attached patch and report the debug output for one of the
>>>>>> interfaces in both the working and the non-working case?
>>>>> I would love to give you output from the working and non-working case
>>>>> but apparently the phase of the moon has changed, I can't get it to fail
>>>>> now.  The messages output from the working case is attached.
>>>>
>>>> Thanks but unfortunately this doesn't make any sense either. In general,
>>>> printf()s cause delays which can be relevant. In the locations I've put
>>>> them they hardly can make such a difference though.
>>>> If you haven't already done so, could you please power off the machine
>>>> before doing the test with the patched module? Is the problem still gone
>>>> if you revert to the original module?
>>> OK, power-cycling makes a difference.  The driver fails to attach all of
>>> the devices after power-cycling most of the time if not all of the
>>> time.  The number of devices attached varies, the attached message file
>>> fragment is from my last test.  Three of the devices were attached on
>>> the first load attempt and all four of them on the second attempt.
>> Okay, so we now at least have a way to reproduce the problem.
>> Unfortunately, it's still unclear what's the exact cause of it. At
>> least the problem is not what I suspected and hoped it most likely is.
>> Could you please test how things behave after a power-cycle with the
>> attached patch (after reverting the previous one).
>
> The patched driver fails to compile with the following error message:
>
> ...
>
> I found the following definition of nitems in the iwn and usb/wlan drivers:
>
> #define nitems(_a)  (sizeof((_a)) / sizeof((_a)[0]))
>
> so I added it to if_cas.c and rebuilt without errors.
 

Sorry, I didn't think of 8.3 not having nitems(), yet. Actually,
this part of the patch is orthogonal to your problem and just a
change I had in that tree.

> This looks like it fixed the problem.  I ran three tests from
> power-up to loading the driver and the driver loaded successfully all
> three times.  I then added if_cas_load=YES to /boot/loader.conf and
> did two more successful reboots from power-up.

Great! Thanks a lot for testing!

 
> Will this driver work on FreeBSD 9.1?
 

Yes, the patch should also solve the problem in 9.1. I suspect the
hang you are seeing there isn't specific to cas(4) but rather a
general regression that came in with the VIMAGE changes. Now, if
a network device driver fails to attach during boot and tries to
clean up by detaching and freeing the interface part at that stage
again this causes problems. I already talked to bz@ about this and,
from what I remember of his reply, this is an ordering issue that is at
least very hard to fix.

Marius



Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Marius Strobl
On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
> Hi,
>
> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
> QGE (501-6738-10).  The cas driver fails to load the first time I try to
> load it but succeeds the second time.  Is this a problem with the card,
> the driver, my karma?

Wrong phase of the moon, apparently :)
The MII setup of these chips is a bit tricky and I'm not sure whether
I've hit all code paths during development of the driver. I certainly
didn't test with a 501-6738, these have been reported as working before,
though. It also doesn't make much sense that attaching the devices
succeeds on the second attempt. Could you please use a if_cas.ko built
with the attached patch and report the debug output for one of the
interfaces in both the working and the non-working case?

Marius

Index: if_cas.c
===================================================================
--- if_cas.c	(revision 245046)
+++ if_cas.c	(working copy)
@@ -332,6 +332,8 @@ cas_attach(struct cas_softc *sc)
 		 */
 		error = ENXIO;
 		v = CAS_READ_4(sc, CAS_MIF_CONF);
+device_printf(sc->sc_dev, "MIF=0x%x PCFG=0x%x\n", v,
+CAS_READ_4(sc, CAS_SATURN_PCFG));
 		if ((v & CAS_MIF_CONF_MDI1) != 0) {
 			v |= CAS_MIF_CONF_PHY_SELECT;
 			CAS_WRITE_4(sc, CAS_MIF_CONF, v);
@@ -347,6 +349,8 @@ cas_attach(struct cas_softc *sc)
 			error = mii_attach(sc->sc_dev, &sc->sc_miibus, ifp,
 			    cas_mediachange, cas_mediastatus, BMSR_DEFCAPMASK,
 			    MII_PHY_ANY, MII_OFFSET_ANY, MIIF_DOPAUSE);
+if (error == 0)
+device_printf(sc->sc_dev, "external PHY\n");
 		}
 		/*
 		 * Fall back on an internal PHY if no external PHY was found.
@@ -367,6 +371,8 @@ cas_attach(struct cas_softc *sc)
 			error = mii_attach(sc->sc_dev, &sc->sc_miibus, ifp,
 			    cas_mediachange, cas_mediastatus, BMSR_DEFCAPMASK,
 			    MII_PHY_ANY, MII_OFFSET_ANY, MIIF_DOPAUSE);
+if (error == 0)
+device_printf(sc->sc_dev, "internal PHY\n");
 		}
 	} else {
 		/*

Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Paul Keusemann


On 01/24/13 09:09, Marius Strobl wrote:
> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
>> Hi,
>>
>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
>> QGE (501-6738-10).  The cas driver fails to load the first time I try to
>> load it but succeeds the second time.  Is this a problem with the card,
>> the driver, my karma?
>
> Wrong phase of the moon, apparently :)
> The MII setup of these chips is a bit tricky and I'm not sure whether
> I've hit all code paths during development of the driver. I certainly
> didn't test with a 501-6738, these have been reported as working before,
> though. It also doesn't make much sense that attaching the devices
> succeeds on the second attempt. Could you please use a if_cas.ko built
> with the attached patch and report the debug output for one of the
> interfaces in both the working and the non-working case?

I would love to give you output from the working and non-working case
but apparently the phase of the moon has changed, I can't get it to fail
now.  The messages output from the working case is attached.

Let me know if there's anything else I can do.

> Marius



--
Paul Keusemann    pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378

Jan 24 11:00:01 lucid newsyslog[2087]: logfile turned over due to size100K
Jan 24 11:47:39 lucid shutdown: reboot by toor: 
Jan 24 11:47:41 lucid syslogd: exiting on signal 15
Jan 24 11:48:51 lucid syslogd: kernel boot file is /boot/kernel/kernel
Jan 24 11:48:51 lucid kernel: Copyright (c) 1992-2012 The FreeBSD Project.
Jan 24 11:48:51 lucid kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 
1991, 1992, 1993, 1994
Jan 24 11:48:51 lucid kernel: The Regents of the University of California. All 
rights reserved.
Jan 24 11:48:51 lucid kernel: FreeBSD is a registered trademark of The FreeBSD 
Foundation.
Jan 24 11:48:51 lucid kernel: FreeBSD 8.3-RELEASE #0: Thu Jan 24 11:15:13 CST 
2013
Jan 24 11:48:51 lucid kernel: toor@lucid:/usr/obj/usr/src/sys/LUCID amd64
Jan 24 11:48:51 lucid kernel: Timecounter i8254 frequency 1193182 Hz quality 0
Jan 24 11:48:51 lucid kernel: CPU: Intel(R) Xeon(R) CPU   X3210  @ 
2.13GHz (2133.42-MHz K8-class CPU)
Jan 24 11:48:51 lucid kernel: Origin = GenuineIntel  Id = 0x6fb  Family = 6  
Model = f  Stepping = 11
Jan 24 11:48:51 lucid kernel: 
Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
Jan 24 11:48:51 lucid kernel: 
Features2=0xe3bdSSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM
Jan 24 11:48:51 lucid kernel: AMD Features=0x20100800SYSCALL,NX,LM
Jan 24 11:48:51 lucid kernel: AMD Features2=0x1LAHF
Jan 24 11:48:51 lucid kernel: TSC: P-state invariant
Jan 24 11:48:51 lucid kernel: real memory  = 4294967296 (4096 MB)
Jan 24 11:48:51 lucid kernel: avail memory = 4099231744 (3909 MB)
Jan 24 11:48:51 lucid kernel: ACPI APIC Table: DELL   PE_SC3  
Jan 24 11:48:51 lucid kernel: FreeBSD/SMP: Multiprocessor System Detected: 4 
CPUs
Jan 24 11:48:51 lucid kernel: FreeBSD/SMP: 1 package(s) x 4 core(s)
Jan 24 11:48:51 lucid kernel: cpu0 (BSP): APIC ID:  0
Jan 24 11:48:51 lucid kernel: cpu1 (AP): APIC ID:  1
Jan 24 11:48:51 lucid kernel: cpu2 (AP): APIC ID:  2
Jan 24 11:48:51 lucid kernel: cpu3 (AP): APIC ID:  3
Jan 24 11:48:51 lucid kernel: ioapic0: Changing APIC ID to 4
Jan 24 11:48:51 lucid kernel: ioapic1: Changing APIC ID to 5
Jan 24 11:48:51 lucid kernel: ioapic0 Version 2.0 irqs 0-23 on motherboard
Jan 24 11:48:51 lucid kernel: ioapic1 Version 2.0 irqs 32-55 on motherboard
Jan 24 11:48:51 lucid kernel: kbd1 at kbdmux0
Jan 24 11:48:51 lucid kernel: acpi0: DELL PE_SC3 on motherboard
Jan 24 11:48:51 lucid kernel: acpi0: [ITHREAD]
Jan 24 11:48:51 lucid kernel: acpi0: Power Button (fixed)
Jan 24 11:48:51 lucid kernel: Timecounter ACPI-fast frequency 3579545 Hz 
quality 1000
Jan 24 11:48:51 lucid kernel: acpi_timer0: 24-bit timer at 3.579545MHz port 
0x808-0x80b on acpi0
Jan 24 11:48:51 lucid kernel: cpu0: ACPI CPU on acpi0
Jan 24 11:48:51 lucid kernel: cpu1: ACPI CPU on acpi0
Jan 24 11:48:51 lucid kernel: cpu2: ACPI CPU on acpi0
Jan 24 11:48:51 lucid kernel: cpu3: ACPI CPU on acpi0
Jan 24 11:48:51 lucid kernel: pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on 
acpi0
Jan 24 11:48:51 lucid kernel: pci0: ACPI PCI bus on pcib0
Jan 24 11:48:51 lucid kernel: pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 
on pci0
Jan 24 11:48:51 lucid kernel: pci1: ACPI PCI bus on pcib1
Jan 24 11:48:51 lucid kernel: pcib2: ACPI PCI-PCI bridge irq 16 at device 
28.0 on pci0
Jan 24 11:48:51 lucid kernel: pci2: ACPI PCI bus on pcib2
Jan 24 11:48:51 lucid kernel: pcib3: ACPI PCI-PCI bridge at device 0.0 on pci2
Jan 24 11:48:51 lucid kernel: pci3: ACPI PCI bus on pcib3
Jan 24 11:48:51 lucid kernel: pcib4: PCI-PCI bridge at device 2.0 on pci3
Jan 24 11:48:51 lucid kernel: pci4: PCI bus on pcib4
Jan 24 11:48:51 lucid kernel: pci4: 

Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Marius Strobl
On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:
>
> On 01/24/13 09:09, Marius Strobl wrote:
>> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
>>> Hi,
>>>
>>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
>>> QGE (501-6738-10).  The cas driver fails to load the first time I try to
>>> load it but succeeds the second time.  Is this a problem with the card,
>>> the driver, my karma?
>> Wrong phase of the moon, apparently :)
>> The MII setup of these chips is a bit tricky and I'm not sure whether
>> I've hit all code paths during development of the driver. I certainly
>> didn't test with a 501-6738, these have been reported as working before,
>> though. It also doesn't make much sense that attaching the devices
>> succeeds on the second attempt. Could you please use a if_cas.ko built
>> with the attached patch and report the debug output for one of the
>> interfaces in both the working and the non-working case?
>
> I would love to give you output from the working and non-working case
> but apparently the phase of the moon has changed, I can't get it to fail
> now.  The messages output from the working case is attached.
 

Thanks but unfortunately this doesn't make any sense either. In general,
printf()s cause delays which can be relevant. In the locations I've put
them they hardly can make such a difference though.
If you haven't already done so, could you please power off the machine
before doing the test with the patched module? Is the problem still gone
if you revert to the original module?

Marius



Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Paul Keusemann


On 01/24/13 15:50, Marius Strobl wrote:
> On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:
>> On 01/24/13 09:09, Marius Strobl wrote:
>>> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
>>>> Hi,
>>>>
>>>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
>>>> QGE (501-6738-10).  The cas driver fails to load the first time I try to
>>>> load it but succeeds the second time.  Is this a problem with the card,
>>>> the driver, my karma?
>>> Wrong phase of the moon, apparently :)
>>> The MII setup of these chips is a bit tricky and I'm not sure whether
>>> I've hit all code paths during development of the driver. I certainly
>>> didn't test with a 501-6738, these have been reported as working before,
>>> though. It also doesn't make much sense that attaching the devices
>>> succeeds on the second attempt. Could you please use a if_cas.ko built
>>> with the attached patch and report the debug output for one of the
>>> interfaces in both the working and the non-working case?
>> I would love to give you output from the working and non-working case
>> but apparently the phase of the moon has changed, I can't get it to fail
>> now.  The messages output from the working case is attached.
>
> Thanks but unfortunately this doesn't make any sense either. In general,
> printf()s cause delays which can be relevant. In the locations I've put
> them they hardly can make such a difference though.
> If you haven't already done so, could you please power off the machine
> before doing the test with the patched module? Is the problem still gone
> if you revert to the original module?


OK, power-cycling makes a difference.  The driver fails to attach all of 
the devices after power-cycling most of the time if not all of the 
time.  The number of devices attached varies, the attached message file 
fragment is from my last test.  Three of the devices were attached on 
the first load attempt and all four of them on the second attempt.


In the interest of full disclosure, I did build a new kernel but it is 
just a copy of GENERIC.  This is a




> Marius




--
Paul Keusemann    pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378

Jan 24 20:32:32 lucid kernel: Copyright (c) 1992-2012 The FreeBSD Project.
Jan 24 20:32:32 lucid kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 
1991, 1992, 1993, 1994
Jan 24 20:32:32 lucid kernel: The Regents of the University of California. All 
rights reserved.
Jan 24 20:32:32 lucid kernel: FreeBSD is a registered trademark of The FreeBSD 
Foundation.
Jan 24 20:32:32 lucid kernel: FreeBSD 8.3-RELEASE #0: Thu Jan 24 11:15:13 CST 
2013
Jan 24 20:32:32 lucid kernel: toor@lucid:/usr/obj/usr/src/sys/LUCID amd64
Jan 24 20:32:32 lucid kernel: Timecounter i8254 frequency 1193182 Hz quality 0
Jan 24 20:32:32 lucid kernel: CPU: Intel(R) Xeon(R) CPU   X3210  @ 
2.13GHz (2133.42-MHz K8-class CPU)
Jan 24 20:32:32 lucid kernel: Origin = GenuineIntel  Id = 0x6fb  Family = 6  
Model = f  Stepping = 11
Jan 24 20:32:32 lucid kernel: 
Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
Jan 24 20:32:32 lucid kernel: 
Features2=0xe3bdSSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM
Jan 24 20:32:32 lucid kernel: AMD Features=0x20100800SYSCALL,NX,LM
Jan 24 20:32:32 lucid kernel: AMD Features2=0x1LAHF
Jan 24 20:32:32 lucid kernel: TSC: P-state invariant
Jan 24 20:32:32 lucid kernel: real memory  = 4294967296 (4096 MB)
Jan 24 20:32:32 lucid kernel: avail memory = 4099231744 (3909 MB)
Jan 24 20:32:32 lucid kernel: ACPI APIC Table: DELL   PE_SC3  
Jan 24 20:32:32 lucid kernel: FreeBSD/SMP: Multiprocessor System Detected: 4 
CPUs
Jan 24 20:32:32 lucid kernel: FreeBSD/SMP: 1 package(s) x 4 core(s)
Jan 24 20:32:32 lucid kernel: cpu0 (BSP): APIC ID:  0
Jan 24 20:32:32 lucid kernel: cpu1 (AP): APIC ID:  1
Jan 24 20:32:32 lucid kernel: cpu2 (AP): APIC ID:  2
Jan 24 20:32:32 lucid kernel: cpu3 (AP): APIC ID:  3
Jan 24 20:32:32 lucid kernel: ioapic0: Changing APIC ID to 4
Jan 24 20:32:32 lucid kernel: ioapic1: Changing APIC ID to 5
Jan 24 20:32:32 lucid kernel: ioapic0 Version 2.0 irqs 0-23 on motherboard
Jan 24 20:32:32 lucid kernel: ioapic1 Version 2.0 irqs 32-55 on motherboard
Jan 24 20:32:32 lucid kernel: kbd1 at kbdmux0
Jan 24 20:32:32 lucid kernel: acpi0: DELL PE_SC3 on motherboard
Jan 24 20:32:32 lucid kernel: acpi0: [ITHREAD]
Jan 24 20:32:32 lucid kernel: acpi0: Power Button (fixed)
Jan 24 20:32:32 lucid kernel: Timecounter ACPI-fast frequency 3579545 Hz 
quality 1000
Jan 24 20:32:32 lucid kernel: acpi_timer0: 24-bit timer at 3.579545MHz port 
0x808-0x80b on acpi0
Jan 24 20:32:32 lucid kernel: cpu0: ACPI CPU on acpi0
Jan 24 20:32:32 lucid kernel: cpu1: ACPI CPU on acpi0
Jan 24 20:32:32 lucid kernel: cpu2: ACPI CPU on acpi0
Jan 24 20:32:32 lucid kernel: cpu3: ACPI CPU on acpi0
Jan 24 20:32:32 lucid kernel: pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on 
acpi0

Cas driver fails to load first time after boot.

2013-01-22 Thread Paul Keusemann

Hi,

I've got a Dell R200 which I'm trying to build into a gateway with a Sun 
QGE (501-6738-10).  The cas driver fails to load the first time I try to 
load it but succeeds the second time.  Is this a problem with the card, 
the driver, my karma?



Initially, I tried to install FreeBSD-9.1-Release but booting the
installation DVD hangs after failing to attach cas0.  I was able to
successfully install FreeBSD-8.3-Release, which apparently does not have
the cas driver built into the installer kernel.


The first time I try to load the cas module after booting results in the 
following output:


# kldload -v if_cas
Loaded if_cas, id=2

with the following logged to /var/log/messages:

Jan 22 13:48:27 lucid kernel: cas0: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdf80-0xdf9f irq 35 at device 0.0 on pci4
Jan 22 13:48:27 lucid kernel: cas0: attaching PHYs failed
Jan 22 13:48:27 lucid kernel: cas0: could not be attached
Jan 22 13:48:27 lucid kernel: device_attach: cas0 attach returned 6
Jan 22 13:48:27 lucid kernel: cas1: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdfa0-0xdfbf irq 34 at device 1.0 on pci4
Jan 22 13:48:27 lucid kernel: cas1: attaching PHYs failed
Jan 22 13:48:28 lucid kernel: cas1: could not be attached
Jan 22 13:48:28 lucid kernel: device_attach: cas1 attach returned 6
Jan 22 13:48:28 lucid kernel: cas2: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdfc0-0xdfdf irq 33 at device 2.0 on pci4
Jan 22 13:48:28 lucid kernel: miibus2: <MII bus> on cas2
Jan 22 13:48:28 lucid kernel: nsgphy0: <DP83865 10/100/1000 media interface> PHY 1 on miibus2
Jan 22 13:48:28 lucid kernel: nsgphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master,auto, auto-flow
Jan 22 13:48:28 lucid kernel: cas2: 16kB RX FIFO, 9kB TX FIFO
Jan 22 13:48:28 lucid kernel: cas2: Ethernet address: 00:14:4f:25:ca:12
Jan 22 13:48:28 lucid kernel: cas2: [FILTER]
Jan 22 13:48:28 lucid kernel: cas3: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdfe0-0xdfff irq 32 at device 3.0 on pci4
Jan 22 13:48:28 lucid kernel: cas3: attaching PHYs failed
Jan 22 13:48:28 lucid kernel: cas3: could not be attached
Jan 22 13:48:28 lucid kernel: device_attach: cas3 attach returned 6


If I unload the cas driver, I get the following in /var/log/messages:

Jan 22 14:03:42 lucid kernel: nsgphy0: detached
Jan 22 14:03:42 lucid kernel: miibus2: detached
Jan 22 14:03:42 lucid kernel: cas2: detached
Jan 22 14:03:42 lucid kernel: pci4: <network, ethernet> at device 2.0 (no driver attached)



The second time I try to load the cas kernel module after booting 
results in the following output:


# kldload -v if_cas
Loaded if_cas, id=2

and the following logged to /var/log/messages:

Jan 22 14:04:33 lucid kernel: cas0: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdf80-0xdf9f irq 35 at device 0.0 on pci4
Jan 22 14:04:33 lucid kernel: miibus2: <MII bus> on cas0
Jan 22 14:04:33 lucid kernel: nsgphy0: <DP83865 10/100/1000 media interface> PHY 1 on miibus2
Jan 22 14:04:33 lucid kernel: nsgphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master,auto, auto-flow
Jan 22 14:04:33 lucid kernel: cas0: 16kB RX FIFO, 9kB TX FIFO
Jan 22 14:04:33 lucid kernel: cas0: Ethernet address: 00:14:4f:25:ca:10
Jan 22 14:04:33 lucid kernel: cas0: [FILTER]
Jan 22 14:04:33 lucid kernel: cas1: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdfa0-0xdfbf irq 34 at device 1.0 on pci4
Jan 22 14:04:33 lucid kernel: miibus3: <MII bus> on cas1
Jan 22 14:04:33 lucid kernel: nsgphy1: <DP83865 10/100/1000 media interface> PHY 1 on miibus3
Jan 22 14:04:33 lucid kernel: nsgphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master,auto, auto-flow
Jan 22 14:04:33 lucid kernel: cas1: 16kB RX FIFO, 9kB TX FIFO
Jan 22 14:04:33 lucid kernel: cas1: Ethernet address: 00:14:4f:25:ca:11
Jan 22 14:04:33 lucid kernel: cas1: [FILTER]
Jan 22 14:04:33 lucid kernel: cas2: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdfc0-0xdfdf irq 33 at device 2.0 on pci4
Jan 22 14:04:33 lucid kernel: miibus4: <MII bus> on cas2
Jan 22 14:04:33 lucid kernel: nsgphy2: <DP83865 10/100/1000 media interface> PHY 1 on miibus4
Jan 22 14:04:33 lucid kernel: nsgphy2:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master,auto, auto-flow
Jan 22 14:04:33 lucid kernel: cas2: 16kB RX FIFO, 9kB TX FIFO
Jan 22 14:04:33 lucid kernel: cas2: Ethernet address: 00:14:4f:25:ca:12
Jan 22 14:04:33 lucid kernel: cas2: [FILTER]
Jan 22 14:04:33 lucid kernel: cas3: <NS DP83065 Saturn Gigabit Ethernet> mem 0xdfe0-0xdfff irq 32 at device 3.0 on pci4
Jan 22 14:04:33 lucid kernel: miibus5: <MII bus> on cas3
Jan 22 14:04:33 lucid kernel: nsgphy3: <DP83865 10/100/1000 media interface> PHY 1 on miibus5
Jan 22 14:04:33 lucid kernel: nsgphy3:  none, 10baseT,