Javen,
The following is a capture of the panic (a #gp general protection trap).
During debugging I have hit this kind of panic many times, but not at the
same location every time. It does not seem to be in my code, so how can I
debug it? It occurs when DDI_INTR_UNCLAIMED is returned from the device
driver ISR. The panicking thread is java/16.
-------------------------------------------------------------------
panic[cpu2]/thread=fffffffecddd7c00: BAD TRAP: type=d (#gp General protection)
rp=ffffff000409f770 addr=baddcafebaddcafe
java: #gp General protection
addr=0xbaddcafebaddcafe
pid=680, pc=0xfffffffffb968b0c, sp=0xffffff000409f860, eflags=0x10287
cr0: 80050033<pg,wp,ne,et,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 81182c8 cr3: 7fe5b000 cr8: c
rdi: fffffffec2352ac0 rsi: ffffff000409fa40 rdx: fffffffecddd7c00
rcx: ffffff000409fcd0 r8: 57ab0 r9: ffffff000409fa47
rax: 0 rbx: eb93ba72 rbp: ffffff000409f8c0
r10: fffffffff78da5e0 r11: fffffffeccffc057 r12: fffffffecde0d700
r13: 3 r14: fffffffec2352ab0 r15: baddcafebaddcafe
fsb: 0 gsb: fffffffec2cb6a80 ds: 4b
es: 4b fs: 0 gs: 1c3
trp: d err: 0 rip: fffffffffb968b0c
cs: 30 rfl: 10287 rsp: ffffff000409f860
ss: 38
ffffff000409f650 unix:die+ea ()
ffffff000409f760 unix:trap+3ca ()
ffffff000409f770 unix:_cmntrap+e9 ()
ffffff000409f8c0 genunix:dnlc_lookup+9c ()
ffffff000409f950 ufs:ufs_lookup+9f ()
ffffff000409f9e0 genunix:fop_lookup+87 ()
ffffff000409fbf0 genunix:lookuppnvp+2fa ()
ffffff000409fc90 genunix:lookuppnat+125 ()
ffffff000409fd70 genunix:lookupnameat+b9 ()
ffffff000409fe00 genunix:cstatat_getvp+160 ()
ffffff000409fea0 genunix:cstatat64_32+7d ()
ffffff000409fec0 genunix:stat64_32+31 ()
ffffff000409ff10 unix:brand_sys_sysenter+1f2 ()
panic: entering debugger (continue to save dump)
Welcome to kmdb
kmdb: unable to determine terminal type: assuming `vt100'
Loaded modules: [ scsi_vhci crypto cpc uppc neti ptm ufs unix zfs krtld s1394
sppp ipc nca uhci hook lofs genunix ip logindmux usba specfs pcplusmp md random
sctp arp ]
[2]> ::status
debugging live kernel (64-bit) on test
operating system: 5.11 snv_70b (i86pc)
CPU-specific support: Intel Pentium 4 (Prescott)
DTrace state: inactive
stopped on: debugger entry trap
[2]> ::stackregs
fffffffffbc4bcf0 kmdb_enter+0xb()
fffffffffbc4bd20 debug_enter+0x37(fffffffffb8f29f0)
fffffffffbc4bdc0 panicsys+0x402(fffffffffb8f0c10, ffffff000409f5a8,
fffffffffbc4bdd0, 1)
ffffff000409f4e0 vpanic+0x15d()
ffffff000409f5d0 panic+0x9c()
ffffff000409f650 die+0xea(d, ffffff000409f770, baddcafebaddcafe, 2)
ffffff000409f760 trap+0x3ca(ffffff000409f770, baddcafebaddcafe, 2)
ffffff000409f770 0xfffffffffb8001d9()
ffffff000409f8c0 dnlc_lookup+0x9c(fffffffecde0d700, ffffff000409fa40)
ffffff000409f950 ufs`ufs_lookup+0x9f(fffffffecde0d700, ffffff000409fa40,
ffffff000409fa30, ffffff000409fcd0, 0, fffffffec17e0a40, fffffffecccaa970)
ffffff000409f9e0 fop_lookup+0x87(fffffffecde0d700, ffffff000409fa40,
ffffff000409fa30, ffffff000409fcd0, 0, fffffffec17e0a40, fffffffecccaa970)
ffffff000409fbf0 lookuppnvp+0x2fa(ffffff000409fcd0, 0, 1, 0, ffffff000409fe40,
fffffffec17e0a40, fffffffecde0d700, fffffffecccaa970)
ffffff000409fc90 lookuppnat+0x125(ffffff000409fcd0, 0, 1, 0, ffffff000409fe40, 0
)
ffffff000409fd70 lookupnameat+0xb9(89ea960, 0, 1, 0, ffffff000409fe40, 0)
ffffff000409fe00 cstatat_getvp+0x160(ffd19553, 89ea960, 1, 1, ffffff000409fe40,
ffffff000409fe48)
ffffff000409fea0 cstatat64_32+0x7d(ffd19553, 89ea960, 1, ea51d570, 0, 10)
ffffff000409fec0 stat64_32+0x31(89ea960, ea51d570)
ffffff000409ff10 _sys_sysenter_post_swapgs+0x14b()
::threadlist
...
fffffffec3259380 fffffffeccd092e8 fffffffec31a8c10 intrd/1
ffffff00045f6c80 fffffffffbc24f30 0 taskq_thread()
fffffffec2fb7420 fffffffeccd06188 fffffffec300a0f0 java/3
fffffffec325a100 fffffffeccd06188 fffffffec31a3cb0 java/4
fffffffec31285a0 fffffffeccd06188 fffffffec2ff8140 java/5
fffffffec3ad97a0 fffffffeccd06188 fffffffec3a44d00 java/6
fffffffec30d97c0 fffffffeccd06188 fffffffec3000610 java/7
fffffffec31274c0 fffffffeccd06188 fffffffec31b61d0 java/8
fffffffec312c220 fffffffeccd06188 fffffffec2ffb130 java/9
fffffffec32596e0 fffffffeccd06188 fffffffec31a8110 java/10
fffffffec3334480 fffffffeccd06188 fffffffec2cf4630 java/11
fffffffec3ada1c0 fffffffeccd06188 fffffffec31b2c60 java/12
fffffffec3ad9440 fffffffeccd06188 fffffffec3a44780 java/13
fffffffec3ad90e0 fffffffeccd06188 fffffffec3a44200 java/14
fffffffec3128240 fffffffeccd06188 fffffffec2ff7bc0 java/15
fffffffecddd7c00 fffffffeccd06188 fffffffec3a43c80 java/16
fffffffecddd78a0 fffffffeccd06188 fffffffec3a43700 java/17
fffffffecddd7540 fffffffeccd06188 fffffffec3a43180 java/18
fffffffecddd71e0 fffffffeccd06188 fffffffec3a39810 java/19
fffffffecddd6e80 fffffffeccd06188 fffffffec3a39290 java/20
fffffffecddd6b20 fffffffeccd06188 fffffffec3a38d10 java/21
fffffffecddd67c0 fffffffeccd06188 fffffffec3a38790 java/22
fffffffecddd6460 fffffffeccd06188 fffffffec3a38210 java/23
fffffffecddd6100 fffffffeccd06188 fffffffec3a37c90 java/24
fffffffecddd5c20 fffffffeccd06188 fffffffec3a37710 java/25
fffffffecddd58c0 fffffffeccd06188 fffffffec3a37190 java/26
fffffffecddd5560 fffffffeccd06188 fffffffec3a36820 java/27
fffffffecddd5200 fffffffeccd06188 fffffffec3a362a0 java/28
-------------------------------------------------------------------
Can you give me some advice?
Thanks
Steve
-----Original Message-----
From: Javen Wu [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2008 7:19 PM
To: Steve Chang
Cc: [email protected]
Subject: Re: [driver-discuss] SCSI HBA driver debugging questions
Steve Chang wrote:
>Javen,
>But will the sd driver and the ses driver handle it the same way, since
>my HBA is for hard disks?
>
No. sd(7D) and ses(7D) handle disk and enclosure devices in different ways.
Take a look at sdattach() and ses_attach(): they do different things, which
is why you see different command sequences.
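
For reading traces like the "cmd flow" lines quoted further down, a small
host-side helper can put names on the opcodes. This is only a sketch in
Python, not part of any driver: the function name and the accepted trace
format (two-digit hex opcodes, optionally prefixed with dashes) are
assumptions, and the table covers only the opcodes that appear in this
thread, named as in the SCSI command set standards.

```python
# Names for the SCSI opcodes that show up in the traces in this thread.
# This is not a complete opcode table.
SCSI_OPCODE_NAMES = {
    0x00: "TEST UNIT READY",
    0x12: "INQUIRY",
    0x1A: "MODE SENSE(6)",
    0x1B: "START STOP UNIT",
    0x1C: "RECEIVE DIAGNOSTIC RESULTS",
    0x25: "READ CAPACITY(10)",
    0x46: "GET CONFIGURATION",
    0x5A: "MODE SENSE(10)",
    0x5E: "PERSISTENT RESERVE IN",
}


def decode_cmd_flow(trace):
    """Turn a trace such as '12 -12 -00 -5a' into a list of opcode names.

    The trace format is an assumption based on the 'cmd flow' lines in
    this thread; opcodes missing from the table are reported as unknown.
    """
    names = []
    for token in trace.split():
        opcode = int(token.lstrip("-"), 16)
        names.append(SCSI_OPCODE_NAMES.get(opcode, "unknown (0x%02x)" % opcode))
    return names


if __name__ == "__main__":
    # The first flow is copied from the trace; the second is the ID0 part
    # of the other trace, rewritten by hand into the same format.
    sd_flow = "12 -12 -00 -5a -46 -46 -1b -25 -25 -00 -1b -00 -1a -5e"
    ses_flow = "12 -12 -1c"
    print("sd :", ", ".join(decode_cmd_flow(sd_flow)))
    print("ses:", ", ".join(decode_cmd_flow(ses_flow)))
```

Read this way, the first flow contains disk-oriented commands such as
READ CAPACITY(10), while the second goes from INQUIRY straight to
RECEIVE DIAGNOSTIC RESULTS, which is consistent with two different target
drivers doing their own attach-time work.
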
Also, I don't think your HBA driver supports only hard disks. I remember
you mentioned that your HBA accepts SCSI CDBs, so your driver simply passes
the SPC/SBC/SSC/SMC commands through to the HBA. That is why an SES device
was detected through your HBA driver and attached successfully.
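
Which of sd or ses ends up attached follows from the peripheral device
type that the target reports in its INQUIRY data (the low five bits of
byte 0), not from anything the HBA driver decides. A minimal Python sketch
of that decoding is below; the function names are made up for
illustration, and the driver mapping is deliberately limited to the two
cases seen in this thread.

```python
# Peripheral device type values defined by the SCSI standard (SPC).
TYPE_DISK = 0x00        # direct-access block device
TYPE_ENCLOSURE = 0x0D   # enclosure services device

# Illustrative mapping only: the two target drivers discussed in this thread.
TARGET_DRIVER_BY_TYPE = {
    TYPE_DISK: "sd",
    TYPE_ENCLOSURE: "ses",
}


def peripheral_device_type(inquiry_data):
    """Return the 5-bit peripheral device type from standard INQUIRY data."""
    if len(inquiry_data) < 1:
        raise ValueError("INQUIRY data is empty")
    # Bits 7..5 of byte 0 are the peripheral qualifier; mask them off.
    return inquiry_data[0] & 0x1F


def likely_target_driver(inquiry_data):
    """Guess which target driver would claim the device, or None if unlisted."""
    return TARGET_DRIVER_BY_TYPE.get(peripheral_device_type(inquiry_data))


if __name__ == "__main__":
    # Hypothetical first bytes of two INQUIRY responses.
    for first_byte in (0x00, 0x0D):
        data = bytes([first_byte])
        print("type 0x%02x -> %s" % (peripheral_device_type(data),
                                     likely_target_driver(data)))
```

If the INQUIRY data returned for target 0 LUN 0 carries type 0x0d on some
runs, that alone would be enough for ses to show up there, so dumping
byte 0 of each INQUIRY response in your debug log is a cheap check.
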
>Or do I not need to care about it?
>
>
For your HBA driver, so far you don't need to care about the different types
of SCSI devices. Your HBA driver just passes the SCSI CDB through, regardless
of the command or the type of target device.
Javen
>
>Steve
>
>
>-----Original Message-----
>From: Javen Wu [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, April 01, 2008 5:30 PM
>To: Steve Chang
>Cc: [email protected]
>Subject: Re: [driver-discuss] SCSI HBA driver debugging questions
>
>Steve,
>
>I noticed the two lines below. In your first test:
>
>==> /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/[EMAIL PROTECTED],[EMAIL
>PROTECTED]/[EMAIL PROTECTED],0
>
>That means target0 lun0 of [EMAIL PROTECTED],[EMAIL PROTECTED] is a disk, so
>the command sequence that follows was issued by the sd driver.
>
>But the second time:
>==> ses0 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1103,[EMAIL
>PROTECTED]/[EMAIL PROTECTED],0
>
>That means target0 lun0 of *pci1103,[EMAIL PROTECTED] is an enclosure device.
>The command sequence that follows was issued by the ses driver.
>
>As I said before, Solaris device configuration is multi-threaded, so either
>of your two instances (the two channels) can be configured first.
>The order in which the two instances ([EMAIL PROTECTED],[EMAIL PROTECTED],
>*pci1103,[EMAIL PROTECTED]) are configured is effectively random. I don't see
>any problem here.
>
>Javen
>
>Steve Chang wrote:
>
>
>>Javen,
>>Here is a question.
>>Two disks are connected to the HBA channels (channel 0 and channel 1). I ran
>>the test twice (reboot the system, then start the test) but got two
>>different results.
>>
>>Case I: the kernel processes ID0 first, then scans ID1
>> ID0 cmd flow ==> 12 -12 -00 -5a -46 -46 -1b -25 -25 -00 -1b -00 -1a -5e
>> ==> sd1 at hptiop0
>> ==> /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/[EMAIL PROTECTED],[EMAIL PROTECTED]/[EMAIL PROTECTED],0
>> Then ID1 cmd flow ==> 12 -12 -00 -5a -46 -46 -1b -25 -25 -00 -1b -00 -1a -5e
>> ==> sd2 at hptiop0
>> ==> /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/[EMAIL PROTECTED],[EMAIL PROTECTED]/[EMAIL PROTECTED],0
>>
>>
>>Case II: the kernel processes ID0, ID1, ..., ID15, then goes back to ID0 again
>> Cmd flow [cmd(id)] ==> 12(0) --12(1) --12(2) ... -12(15)
>> Then cmd(id) ==> 12(0) --12(0) --1c(0)
>> ==> ses 0 at hptiop0:target0/lun0
>> ==> ses0 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1103,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
>>
>>
>>Q: Why does the same procedure not give a consistent result across test runs?
>> Does it mean the data returned for INQUIRY (0x12) is incorrect?
>>
>>
>>Regards,
>>Steve
>>
>>
>>-----Original Message-----
>>From: Javen Wu [mailto:[EMAIL PROTECTED]
>>Sent: Thursday, March 27, 2008 6:45 PM
>>To: Steve Chang
>>Cc: [email protected]
>>Subject: Re: [driver-discuss] SCSI HBA driver debugging questions
>>
>>The question is whether you remove the interrupt handler in your detach
>>routine. I don't know which DDI interface you used.
>>If you used ddi_intr_add_handler() in your attach routine, did you call
>>ddi_intr_remove_handler() in your detach routine?
>>
>>Javen
>>
>>Steve Chang wrote:
>>
>>
>>
>>>Javen,
>>>During debugging, I found an issue that does not seem to be related to my
>>>driver:
>>>
>>>(1) Use the following to remove the previously attached driver
>>> #rem_drv mydriver
>>>(2) Then use the following to add the driver again immediately
>>> #add_drv -i '"pci1103,0"' -c scsi mydriver
>>> (where pci1103,0 is the HBA PCI ID)
>>>
>>> The kernel panics immediately. From the trace, the kernel dispatches
>>> the interrupt to my ISR, which is a NULL pointer, so the kernel
>>> panics. How come the kernel doesn't start from the _init() procedure
>>> and still remembers my ISR entry point?
>>>
>>> Does it mean that "#rem_drv mydriver" can't clean up the attached
>>> driver?
>>>
>>>
>>>Thanks
>>>Steve
>>>
>>>
>>>
>>>-----Original Message-----
>>>From: Javen Wu [mailto:[EMAIL PROTECTED]
>>>Sent: Thursday, March 27, 2008 1:35 AM
>>>To: Steve Chang
>>>Cc: [email protected]
>>>Subject: Re: [driver-discuss] SCSI HBA driver debugging questions
>>>
>>>Set up the serial console:
>>>
>>>1. Connect the serial ports of the host and the client (the system running
>>>your debug version driver).
>>>2. On the client side, edit /boot/solaris/bootenv.rc and change the console
>>>from 'text' to 'ttya'. Example:
>>>setprop console 'ttya'
>>>This assumes you connected to ttya on the client.
>>>
>>>3. On the host side, edit /etc/remote: change an existing entry or add a
>>>new one. Below is an example:
>>>hardwire:\
>>> :dv=/dev/term/a:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:
>>>Here /dev/term/a means I connected to ttya on the host side.
>>>
>>>4. Change the GRUB menu.lst of the client to redirect GRUB to the serial
>>>line. Add the two lines below to your /boot/grub/menu.lst:
>>>
>>> serial --unit=0 --speed=9600
>>> terminal serial
>>>
>>>5. On your host terminal, run "tip hardwire".
>>>
>>>You can enable kmdb by default by adding the -k option to the kernel line
>>>in your GRUB menu.lst, like this:
>>>#---------- ADDED BY BOOTADM - DO NOT EDIT ----------
>>>title Solaris Express Community Edition snv_84 X86
>>>kernel$ /platform/i86pc/kernel/$ISADIR/unix -k
>>>module$ /platform/i86pc/$ISADIR/boot_archive
>>>#---------------------END BOOTADM--------------------
>>>
>>>
>>>Good luck!
>>>Javen
>>>
>>>Steve Chang wrote:
>>>
>>>
>>>
>>>
>>>
>>>>Javen,
>>>>Can you instruct me how to set up kmdb and the serial connection?
>>>>I can't figure out the instructions in the "Writing Device Drivers" doc.
>>>>
>>>>My platform is "x86pc".
>>>>(1) During booting, I select "e" and change the 1st item to boot
>>>> with kmdb using "-k", but the system boots into maintenance
>>>> mode. How do I configure it as you said?
>>>>
>>>>(2) As for setting up the serial port: on the host side, I added
>>>> "ttya -debug" and enabled it with "tip debug". On the target side, I
>>>> used "eeprom output-device=ttya", but I get no output on the host
>>>> system on reboot or after a panic.
>>>>
>>>>Thanks
>>>>
>>>>Steve
>>>>
>>>>
>>>>
>>>>-----Original Message-----
>>>>From: Javen Wu [mailto:[EMAIL PROTECTED]
>>>>Sent: Monday, March 24, 2008 2:16 AM
>>>>To: Steve Chang
>>>>Cc: [email protected]
>>>>Subject: Re: [driver-discuss] SCSI HBA driver debugging questions
>>>>
>>>>Steve,
>>>>
>>>>Firstly, I saw several panics in your attached syslog. The panics were
>>>>most likely caused by your driver.
>>>>
>>>>I cannot understand what you mean by "kernel stop" and "locks up".
>>>>Did you mean "panic" and "hang"?
>>>>
>>>>From your syslog alone, I cannot give any comments. But I can give you some
>>>>suggestions for debugging your HBA driver.
>>>>
>>>>1. Before your driver gets stable, please don't copy your debug version
>>>>driver to /kernel/drv or /usr/kernel/drv, because once your driver has a
>>>>panic bug, you will panic again and again, which prevents you from booting
>>>>the system successfully.
>>>>Instead, create a link under /kernel/drv/ or /usr/kernel/drv/ that points
>>>>to a binary located in /tmp. Before you load your driver, copy your binary
>>>>to /tmp. The contents of /tmp do not survive a reboot, so once a panic
>>>>causes a reboot, the link points to a file that no longer exists, and you
>>>>can boot your system successfully.
>>>>
>>>>2. Please enable kmdb during debugging. You can use the -k option to boot
>>>>your system. Once it hits a panic, the system freezes and you can do live
>>>>analysis or save a core dump for post-mortem analysis.
>>>>
>>>>3. In case of a hang, you can try to break into kmdb, or log in to the
>>>>machine by ssh from another machine and run "mdb -KF". Then force a core
>>>>dump or do live debugging to check the current thread list and see where
>>>>the system hangs. "::threadlist" is very helpful for showing the threads
>>>>and their stacks.
>>>>
>>>>Hope it helps your debugging.
>>>>
>>>>Cheers
>>>>Javen
>>>>
>>>>
>>>>
>>>>Steve Chang wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>Dear Javen,
>>>>>I've been struggling with debugging for a while. Can you point out what's
>>>>>wrong from the attached files (taken from /var/adm/messages)?
>>>>>
>>>>>My target system is a dual Intel Xeon server board platform with Solaris
>>>>>Express Developer Edition (9/07) installed. During debugging:
>>>>>(1) I put mydriver in /usr/kernel/drv/amd64 and mydriver.conf in
>>>>>/usr/kernel/drv
>>>>>(2) Use "prtconf" to check the HBA PCI ID, where I found our HBA
>>>>>"pci1103,0" (driver not attached)
>>>>>(3) Install the driver (as superuser)
>>>>> #add_drv -i '"pci1103,0"' -c scsi mydriver
>>>>>(4) Then the system locks up
>>>>>(5) Fix the kernel and check /var/adm/messages
>>>>>
>>>>>There are two text files in the attachment. They come from running the same
>>>>>procedure two times, but the results are different.
>>>>>1. DEB - the 1st run
>>>>> Kernel stops after scsi_hba_probe() returns for ID=9 ??
>>>>>2. DEB - fix the system and run the same procedure again
>>>>> Kernel sends ID=14 directly and locks up after scsi_hba_probe()
>>>>> returns for ID=15 ??
>>>>>
>>>>>What's wrong with the kernel? Is it caused by a bad kernel or by my code?
>>>>>If I keep fixing the kernel and retrying, I get a different lock-up each
>>>>>time. It is bothering me since I cannot debug my driver. How can I make
>>>>>it stable enough to run the debugging?
>>>>>
>>>>>Thanks
>>>>>
>>>>>Steve Chang
>>>>>HighPoint Technologies, Inc.
>>>>>408-240-6115
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss