Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Bernhard Schmidt
On Wednesday, June 29, 2011 03:50:08 Adrian Chadd wrote:
 This is kinda strange; that symbol doesn't exist in the net80211 or ath 
 source.
 
 What the heck?
 
 
 
 adrian
 
 
 
 On 28 June 2011 17:28, Stefan Esser st_es...@t-online.de wrote:
  Hi,
 
  is this a known issue?
 
  My -CURRENT system (r223560M, amd64, 8GB, Atheros WLAN) panics after
  minutes to hours of uptime with the following message:
 
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 0
  fault virtual address   = 0xff807f502000
  fault code  = supervisor data read, page not present
  ...
  processor eflags= interrupt enabled, resume, IOPL = 0
  current process = 11 (swi4: clock)
  [ thread pid 11 tid 112 ]
  Stopped at  ieee80211_tx_mgmt_timeout+0x1:  movq (%rdi),%rdi
 
  db> bt
  Tracing pid 11 tid 100012 td 0xfe00032e
  ieee80211_tx_mgmt_timeout() at ieee80211_tx_mgmt_timeout+0x1
  intr_event_execute_handlers() at intr_event_execute_handlers+0x66
  ithread_loop() at ithread_loop+0x96
  fork_exit() at fork_exit+0x11d
  fork_trampoline() at fork_trampoline+0xe
  --- trap 0, rip = 0, rsp = 0xff8000288d00, rbp = 0 ---
 
  This panic message is manually transcribed, since the GPT-only
  partitioning prevents dumping of a kernel core. (Why, BTW?)
  I could add a swap partition on a MBR disk, if a core dump seems
  necessary to diagnose the problem. I'm also willing to wait for that
  panic to occur again and to gather more debug output.
 
 
  Other information: The Atheros WLAN in this system is unused (not
  associated) but both ath0 and wlan0 were UP at the time of the panic.
 
  Initial testing shows the system to be stable with both wlan0 and ath0
  set to down after boot. But still, the timeout should not panic the
  kernel, if WLAN is active but not fully configured (e.g. no SSID).
 
  Any ideas?

Its name is ieee80211_tx_mgt_timeout; it is used to track AUTH/ASSOC
requests. AFAIK there is even a similar PR about that.

Adrian, have you got an AP set up to drop either an AUTH or an ASSOC
response frame?

-- 
Bernhard
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Adrian Chadd
On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:

 Its name is ieee80211_tx_mgt_timeout; it is used to track AUTH/ASSOC
 requests. AFAIK there is even a similar PR about that.

 Adrian, have you got an AP set up to drop either an AUTH or an ASSOC
 response frame?

Tell me how and I'll set it up.

A panic at that point in the function suggests that ni may be NULL,
or that ni->vap is now NULL, maybe?



Adrian


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Bernhard Schmidt
On Wednesday, June 29, 2011 10:03:02 Adrian Chadd wrote:
 On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:
 
  Its name is ieee80211_tx_mgt_timeout; it is used to track AUTH/ASSOC
  requests. AFAIK there is even a similar PR about that.
 
  Adrian, have you got an AP set up to drop either an AUTH or an ASSOC
  response frame?
 
 Tell me how and I'll set it up.
 
 A panic at that point in the function suggests that ni may be NULL,
 or that ni->vap is now NULL, maybe?

vap should never be NULL, so, I'd guess it's ni.

Hmm, I'd guess there is some kind of race here. When the driver tells
us that it was able to send the AUTH request frame, net80211 sets up
the timeout callback. What happens if the AUTH response and the
callback hit at the same time? It should be locked appropriately, but
is it?
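
To make the question concrete, here is a minimal userspace sketch of the
invariant being asked about: of the two events (response arrives, timeout
fires), exactly one may act on the pending request. This is not net80211
code. All names (auth_req, on_response, on_timeout) are hypothetical, and
it uses a C11 atomic flag rather than whatever locking net80211 applies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical outcome of one AUTH request. */
enum outcome { OUT_NONE, OUT_RESPONSE, OUT_TIMEOUT };

/* Hypothetical pending-request record shared by both handlers. */
struct auth_req {
	atomic_bool  pending;   /* true while nobody has handled the request */
	enum outcome outcome;   /* written only by the handler that wins */
};

/* Mark a request as sent and waiting for either event. */
static void
auth_req_start(struct auth_req *r)
{
	r->outcome = OUT_NONE;
	atomic_store(&r->pending, true);
}

/*
 * Atomically clear the pending flag and report whether this caller
 * was the one that cleared it. Only one caller can ever see "true".
 */
static bool
auth_req_claim(struct auth_req *r)
{
	return atomic_exchange(&r->pending, false);
}

/* Called when the response frame arrives. */
static bool
on_response(struct auth_req *r)
{
	if (!auth_req_claim(r))
		return false;   /* the timeout already handled it */
	r->outcome = OUT_RESPONSE;
	return true;
}

/* Called when the timer expires. */
static bool
on_timeout(struct auth_req *r)
{
	if (!auth_req_claim(r))
		return false;   /* the response already handled it */
	r->outcome = OUT_TIMEOUT;
	return true;
}
```

Whichever handler runs second returns false and touches nothing, so the
"lost" event is harmless. Whether the real code gives the same guarantee
is exactly the open question.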

This will drop the AUTH response:

Index: sys/net80211/ieee80211_hostap.c
===================================================================
--- sys/net80211/ieee80211_hostap.c (revision 223661)
+++ sys/net80211/ieee80211_hostap.c (working copy)
@@ -978,7 +978,7 @@ hostap_auth_open(struct ieee80211_node *ni, struct
                    "%s", "station authentication defered (radius acl)");
                ieee80211_notify_node_auth(ni);
        } else {
-               IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
+               //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
                IEEE80211_NOTE_MAC(vap,
                    IEEE80211_MSG_DEBUG | IEEE80211_MSG_AUTH, ni->ni_macaddr,
                    "%s", "station authenticated (open)");
@@ -1158,7 +1158,7 @@ hostap_auth_shared(struct ieee80211_node *ni, stru
                estatus = IEEE80211_STATUS_SEQUENCE;
                goto bad;
        }
-       IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
+       //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
        return;
 bad:
        /*


-- 
Bernhard


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Stefan Esser
On 29.06.2011 10:03, Adrian Chadd wrote:
 On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:
 Its name is ieee80211_tx_mgt_timeout; it is used to track AUTH/ASSOC
 requests. AFAIK there is even a similar PR about that.

Sorry, I manually entered the panic message, since dumps were not
working on my system at the time of that panic.

 Adrian, have you got an AP set up to drop either an AUTH or an ASSOC
 response frame?

I've got a number of "AUTH -> SCAN transition lost" messages for wlan0,
seconds to minutes apart:

Jun 28 21:16:17 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:34:46 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:36:33 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:45:14 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
Jun 28 21:45:44 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost

The setup is easy to reproduce; my rc.conf contained:

wlans_ath0=wlan0
ifconfig_ath0=down
ifconfig_wlan0=down
wpa_supplicant_enable=YES

This system used to be connected via ath0, but recently was moved to a
place where Ethernet is available. The panics started only after WLAN
was not used anymore. I might disable wpa_supplicant, since it is not
required in the current situation, but did not try whether that helps
prevent the panic.

 Tell me how and I'll set it up.
 
 A panic at that point in the function suggests that ni may be NULL,
 or that ni->vap is now NULL, maybe?

I recreated the panic, this time with kernel dumps correctly configured
(thanks for the hint, Scot). The panic message is:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xff809c7a1000
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x805e1851
stack pointer   = 0x28:0xff8000288ab0
frame pointer   = 0x28:0xff8000288b60
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 11 (swi4: clock)

Traceback:

#10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
at ../../../net80211/ieee80211_output.c:2487

This indicates that an invalid argument is passed and assigned to
ni, which causes the page fault when ni is dereferenced to obtain vap.

I'm afraid that the assumption in the comment (about the timeout being
safe to use) does not really hold:

static void
ieee80211_tx_mgt_timeout(void *arg)
{
        struct ieee80211_node *ni = arg;
        struct ieee80211vap *vap = ni->ni_vap;

        if (vap->iv_state != IEEE80211_S_INIT &&
            (vap->iv_ic->ic_flags & IEEE80211_F_SCAN) == 0) {
                /*
                 * NB: it's safe to specify a timeout as the reason here;
                 *     it'll only be used in the right state.
                 */
                ieee80211_new_state(vap, IEEE80211_S_SCAN,
                        IEEE80211_SCAN_FAIL_TIMEOUT);
        }
}

If vap is valid during one invocation of that function, I'd expect it
to at least be a pointer to valid kernel memory after the timeout.
I.e., the value found by dereferencing it may be stale, but the pointer
itself should at least not cause a page fault. (???)


The compressed core.txt is 27KB, the compressed vmcore is 800MB. I might
be able to find a place to upload the vmcore file to, but since I'm
currently on a DSL with only 672KBit/s upstream, it would take me some 3
hours to upload to a better connected server (and I'd like to avoid
doing that, if not essential for debugging).

The core.txt is small enough to send by mail. Let me know if you think
it helps you understand the problem.


I'm willing to support debugging, e.g. by placement of printfs in my
kernel for the timeout handler and the creation and destruction of *vap
structures.


After removal of wlans_ath0=wlan0 the system will most probably be
stable, but I did not specifically test this case (i.e. ath0 configured,
but no wlan0 created). I do know that an ifconfig down of ath0 and wlan0
suffices; probably an ifconfig wlan0 down alone would be enough.

So, I know how to avoid the panic, but I think it is still important to
find the cause.

Thank you for looking into this!


Best regards, STefan


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Stefan Esser
On 29.06.2011 10:27, Bernhard Schmidt wrote:
 On Wednesday, June 29, 2011 10:03:02 Adrian Chadd wrote:
 On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:

 Its name is ieee80211_tx_mgt_timeout; it is used to track AUTH/ASSOC
 requests. AFAIK there is even a similar PR about that.

 Adrian, have you got an AP set up to drop either an AUTH or an ASSOC
 response frame?

 Tell me how and I'll set it up.

 A panic at that point in the function suggests that ni may be NULL,
 or that ni->vap is now NULL, maybe?
 
 vap should never be NULL, so, I'd guess it's ni.

No, neither vap nor vap->ni appears to cause a NULL dereference.

The panic message indicates a fault address of 0xff809c7a1000, which
is the value of arg passed to ieee80211_tx_mgt_timeout().

The fault occurs on the first instruction within that function, and I
take this to mean that arg points outside kernel VM space. (I have to
admit that I do not know the exact memory layout for amd64, though.)
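
As a side note on why the fault address equals arg itself: the
movq (%rdi),%rdi instruction from the first panic report loads from
offset 0 of the first argument. That matches the load of ni->ni_vap only
if the vap pointer is the first member of the node structure. The sketch
below assumes that layout. The struct names are hypothetical stand-ins,
not the real net80211 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vap_sketch;                       /* opaque, hypothetical */

/* Hypothetical node layout: vap pointer assumed to be the first member. */
struct node_sketch {
	struct vap_sketch *ni_vap;       /* assumed to sit at offset 0 */
	int                ni_other;     /* placeholder for remaining fields */
};

/*
 * Address that a load of n->ni_vap would read from. It is computed
 * only; nothing is dereferenced, so a stale pointer is harmless here.
 */
static uintptr_t
vap_load_address(const struct node_sketch *n)
{
	return (uintptr_t)n + offsetof(struct node_sketch, ni_vap);
}
```

With the member at offset 0, the address read is the node pointer itself,
so a fault on that load reports exactly the value passed as arg.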

 Hmm.. I'd guess there is some kind of racy behavior, if the driver is
 telling us that it was able to send the AUTH req frame, net80211 sets
 up the timeout callback. What happens if the AUTH resp as well as the
 callback hit at the same time? It should be locked appropriately, but
 is it?
 
 This will drop the AUTH response:

I have received a number of messages that might indicate a lost race:

ieee80211_new_state_locked: pending AUTH -> SCAN transition lost

This repeats at intervals of between a few seconds and 20 minutes.

 Index: sys/net80211/ieee80211_hostap.c
 ===================================================================
 --- sys/net80211/ieee80211_hostap.c   (revision 223661)
 +++ sys/net80211/ieee80211_hostap.c   (working copy)
 @@ -978,7 +978,7 @@ hostap_auth_open(struct ieee80211_node *ni, struct
                     "%s", "station authentication defered (radius acl)");
                 ieee80211_notify_node_auth(ni);
         } else {
 -               IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
 +               //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
                 IEEE80211_NOTE_MAC(vap,
                     IEEE80211_MSG_DEBUG | IEEE80211_MSG_AUTH, ni->ni_macaddr,
                     "%s", "station authenticated (open)");
 @@ -1158,7 +1158,7 @@ hostap_auth_shared(struct ieee80211_node *ni, stru
                 estatus = IEEE80211_STATUS_SEQUENCE;
                 goto bad;
         }
 -       IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
 +       //IEEE80211_SEND_MGMT(ni, IEEE80211_FC0_SUBTYPE_AUTH, seq + 1);
         return;
  bad:
         /*
 
 

I could try that patch for a few hours ...

Regards, STefan


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Adrian Chadd
The question here is - what context is the callback being called in?

The lack of net80211 locking has me confused and sad. :/


Adrian




Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Bernhard Schmidt
On Wednesday, June 29, 2011 10:53:41 Stefan Esser wrote:
 On 29.06.2011 10:03, Adrian Chadd wrote:
  On 29 June 2011 14:03, Bernhard Schmidt bschm...@freebsd.org wrote:
  Its name is ieee80211_tx_mgt_timeout; it is used to track AUTH/ASSOC
  requests. AFAIK there is even a similar PR about that.
 
 Sorry, I manually entered the panic message, since dumps were not
 working on my system at the time of that panic.
 
  Adrian, have you got an AP set up to drop either an AUTH or an ASSOC
  response frame?
 
 I've got a number of "AUTH -> SCAN transition lost" messages for wlan0,
 seconds to minutes apart:
 
 Jun 28 21:16:17 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
 Jun 28 21:34:46 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
 Jun 28 21:36:33 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
 Jun 28 21:45:14 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
 Jun 28 21:45:44 kernel: wlan0: ieee80211_new_state_locked: pending AUTH -> SCAN transition lost
 
 The setup is easy to reproduce; my rc.conf contained:
 
 wlans_ath0=wlan0
 ifconfig_ath0=down
 ifconfig_wlan0=down
 wpa_supplicant_enable=YES

Strip the last 3 lines, and don't ever fiddle around with ath0 directly.
This configuration always starts wpa_supplicant.

 This system used to be connected via ath0, but recently was moved to a
 place where Ethernet is available. The panics started only after WLAN
 was not used anymore. I might disable wpa_supplicant, since it is not
 required in the current situation, but did not try whether that helps
 prevent the panic.
 
  Tell me how and I'll set it up.
  
  A panic at that point in the function suggests that ni may be NULL,
  or that ni->vap is now NULL, maybe?
 
 I recreated the panic, this time with kernel dumps correctly configured
 (thanks for the hint, Scot). The panic message is:
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0xff809c7a1000
 fault code  = supervisor read data, page not present
 instruction pointer = 0x20:0x805e1851
 stack pointer   = 0x28:0xff8000288ab0
 frame pointer   = 0x28:0xff8000288b60
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 11 (swi4: clock)
 
 Traceback:
 
 #10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
 at ../../../net80211/ieee80211_output.c:2487
 
 This indicates that an invalid argument is passed and assigned to
 ni, which causes the page fault when ni is dereferenced to obtain vap.

The problem here seems to be wpa_supplicant. It can try to associate
at any given point in time, which results in the BSS ni being destroyed
even though it might still be referenced somewhere (in this case by the
timeout, or more precisely by ath's TX queue). Not clearing that
reference (or not stopping whatever is using it) is the bug here. Now,
how do we figure out who the caller is? Have you got the complete
backtrace?
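
For illustration, a minimal userspace sketch of that failure mode and of
one way to avoid it: a timer owned by the interface keeps a pointer to a
node, so the node must cancel the timer before it is freed. This is not
the net80211 code or a proposed patch. Every name here (timer,
vap_sketch, node_sketch, node_free) is hypothetical, and the "timer" is a
plain struct fired by hand rather than a kernel callout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel callout: at most one pending callback. */
struct timer {
	void (*fn)(void *);
	void  *arg;
	bool   armed;
};

/* Hypothetical interface state; it owns the management-frame timer. */
struct vap_sketch {
	int          state;      /* 1 = waiting for a response, 0 = scanning */
	struct timer mgt_timer;
};

/* Hypothetical node; the timer callback receives a pointer to it. */
struct node_sketch {
	struct vap_sketch *vap;
};

/* Timeout handler: only safe while the node passed as arg is alive. */
static void
mgt_timeout(void *arg)
{
	struct node_sketch *n = arg;

	n->vap->state = 0;
}

static void
timer_arm(struct timer *t, void (*fn)(void *), void *arg)
{
	t->fn = fn;
	t->arg = arg;
	t->armed = true;
}

static void
timer_stop(struct timer *t)
{
	t->armed = false;
	t->arg = NULL;
}

/* Simulated expiry: run the callback if one is pending. */
static bool
timer_fire(struct timer *t)
{
	if (!t->armed)
		return false;
	t->armed = false;
	t->fn(t->arg);
	return true;
}

static struct node_sketch *
node_alloc(struct vap_sketch *v)
{
	struct node_sketch *n = malloc(sizeof(*n));

	if (n != NULL)
		n->vap = v;
	return n;
}

/* Drop the timer's reference to this node before releasing the memory. */
static void
node_free(struct node_sketch *n)
{
	struct timer *t = &n->vap->mgt_timer;

	if (t->armed && t->arg == n)
		timer_stop(t);
	free(n);
}
```

Without the timer_stop() call in node_free(), a later timer_fire() would
hand the freed pointer to mgt_timeout(), which is the kind of stale arg
seen in the panic.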

-- 
Bernhard


Re: Panic in ieee80211 tx mgmt timeout

2011-06-29 Thread Stefan Esser
On 29.06.2011 12:41, Bernhard Schmidt wrote:
 On Wednesday, June 29, 2011 10:53:41 Stefan Esser wrote:
 I recreated the panic, this time with kernel dumps correctly configured
 (thanks for the hint, Scot). The panic message is:

 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0xff809c7a1000
 fault code  = supervisor read data, page not present
 instruction pointer = 0x20:0x805e1851
 stack pointer   = 0x28:0xff8000288ab0
 frame pointer   = 0x28:0xff8000288b60
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 11 (swi4: clock)

 Traceback:

 #10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
 at ../../../net80211/ieee80211_output.c:2487

 This indicates that an invalid argument is passed and assigned to
 ni, which causes the page fault when ni is dereferenced to obtain vap.
 
 The problem here seems to be wpa_supplicant. It can try to associate
 at any given point in time, which results in the BSS ni being destroyed
 even though it might still be referenced somewhere (in this case by the
 timeout, or more precisely by ath's TX queue). Not clearing that
 reference (or not stopping whatever is using it) is the bug here. Now,
 how do we figure out who the caller is? Have you got the complete
 backtrace?

Not sure that I understand your question correctly ...

#10 0x805e1851 in ieee80211_tx_mgt_timeout (arg=0xff809c7a1000)
    at ../../../net80211/ieee80211_output.c:2487
#11 0x8050f45c in softclock (arg=Variable "arg" is not available.)
    at ../../../kern/kern_timeout.c:564
#12 0x804d9876 in intr_event_execute_handlers (p=Variable "p" is not available.)
    at ../../../kern/kern_intr.c:1257
#13 0x804da4d6 in ithread_loop (arg=0xfe00032dcc60)
    at ../../../kern/kern_intr.c:1270
#14 0x804d718d in fork_exit (callout=0x804da440 <ithread_loop>,
    arg=0xfe00032dcc60, frame=0xff8000288c50)
    at ../../../kern/kern_fork.c:920
#15 0x807258ce in fork_trampoline ()
    at ../../../amd64/amd64/exception.S:603

Bernhard, I'm sending you the compressed core.txt in private mail.

Regards, STefan


Panic in ieee80211 tx mgmt timeout

2011-06-28 Thread Stefan Esser
Hi,

is this a known issue?

My -CURRENT system (r223560M, amd64, 8GB, Atheros WLAN) panics after
minutes to hours of uptime with the following message:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 0
fault virtual address   = 0xff807f502000
fault code  = supervisor data read, page not present
...
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 11 (swi4: clock)
[ thread pid 11 tid 112 ]
Stopped at  ieee80211_tx_mgmt_timeout+0x1:  movq (%rdi),%rdi

db> bt
Tracing pid 11 tid 100012 td 0xfe00032e
ieee80211_tx_mgmt_timeout() at ieee80211_tx_mgmt_timeout+0x1
intr_event_execute_handlers() at intr_event_execute_handlers+0x66
ithread_loop() at ithread_loop+0x96
fork_exit() at fork_exit+0x11d
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff8000288d00, rbp = 0 ---

This panic message is manually transcribed, since the GPT-only
partitioning prevents dumping of a kernel core. (Why, BTW?)
I could add a swap partition on a MBR disk, if a core dump seems
necessary to diagnose the problem. I'm also willing to wait for that
panic to occur again and to gather more debug output.


Other information: The Atheros WLAN in this system is unused (not
associated) but both ath0 and wlan0 were UP at the time of the panic.

Initial testing shows the system to be stable with both wlan0 and ath0
set to down after boot. But still, the timeout should not panic the
kernel, if WLAN is active but not fully configured (e.g. no SSID).

Any ideas?

Best regards, STefan


Re: Panic in ieee80211 tx mgmt timeout

2011-06-28 Thread Scot Hetzel
On Tue, Jun 28, 2011 at 4:28 AM, Stefan Esser st_es...@t-online.de wrote:
 This panic message is manually transcribed, since the GPT-only
 partitioning prevents dumping of a kernel core. (Why, BTW?)

You should be able to get a kernel core dump on a system with a GPT
partitioned disk.

Do you have a freebsd-swap partition?

How is your GPT disk partitioned?

Scot


Re: Panic in ieee80211 tx mgmt timeout

2011-06-28 Thread Adrian Chadd
This is kinda strange; that symbol doesn't exist in the net80211 or ath source.

What the heck?



adrian



