RE: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Cedric Izoard
> > When calling drv_tx() the headroom is not big enough for the driver.
> 
> Ok.
> 
> > >
> > > Maybe we're adding something else to this skb?
> > >
> > > I can't find anything in the ath9k_htc driver that's adding more than
> > > 23 bytes (it's advertising 24) but clearly the last 8 bytes here are
> > > failing:
> > >
> > > > [   83.200346] skbuff: skb_under_panic: text:a034c028 len:154
> > > > put:8 head:880213422e00 data:880213422dfa tail:0x94 end:0xc0
> > > > dev:
> > >
> > > Maybe mac80211 is putting something else? It'd have to be
> >
> > yes mac80211 is adding the security header.
> > headroom asked to skb_copy_expand should also take
> > sdata->encrypt_headroom into account.
> 
> I suspected that, since it was about the only place that was adding anything,
> but I couldn't test it :)
> 
> Can you send a patch?

Sure, I will send a patch.

cedric


RE: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Cedric Izoard
> > Here is the stack trace I get:
> > I added a trace before calling skb_copy_expand to get the headroom of
> > the buffer before the copy and the headroom asked by the driver.
> >
> > [   83.200261] MESH fwd: skb_headroom=154, needed headroom=24
> 
> Could you also add a similar trace just before calling drv_tx()?

When calling drv_tx() the headroom is not big enough for the driver.

> 
> Maybe we're adding something else to this skb?
> 
> I can't find anything in the ath9k_htc driver that's adding more than
> 23 bytes (it's advertising 24) but clearly the last 8 bytes here are
> failing:
> 
> > [   83.200346] skbuff: skb_under_panic: text:a034c028 len:154
> > put:8 head:880213422e00 data:880213422dfa tail:0x94 end:0xc0
> > dev:
> 
> Maybe mac80211 is putting something else? It'd have to be

Yes, mac80211 is adding the security header.
The headroom requested from skb_copy_expand should also take
sdata->encrypt_headroom into account.
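
A minimal userspace sketch of that arithmetic follows; it is not the mac80211
code. The struct names, the helper names and the field name tx_headroom
(standing in for "the headroom asked by the driver") are assumptions made for
illustration. The value 24 comes from the trace in this thread; the 8 bytes
used for encrypt_headroom in the usage note is only an assumed example value.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for the mac80211 structures; only the
 * two quantities discussed in this thread are modelled. */
struct model_local { unsigned int tx_headroom; };      /* headroom the driver asks for */
struct model_sdata { unsigned int encrypt_headroom; }; /* room for the security header */

/* Headroom to request for the forwarded copy: driver need plus crypto need. */
static unsigned int mesh_fwd_headroom(const struct model_local *local,
                                      const struct model_sdata *sdata)
{
	assert(local && sdata);
	return local->tx_headroom + sdata->encrypt_headroom;
}

/* Returns 1 if the driver can still push driver_push bytes after
 * crypto_used bytes of the reserved headroom have already been consumed. */
static int driver_push_fits(unsigned int reserved, unsigned int crypto_used,
                            unsigned int driver_push)
{
	if (crypto_used > reserved)
		return 0;
	return reserved - crypto_used >= driver_push;
}
```

With tx_headroom = 24 and an assumed encrypt_headroom of 8, reserving only 24
bytes leaves 16 for the driver once the security header is in place, so a
driver that really uses its 24 bytes underruns; reserving 24 + 8 = 32 does not.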

cedric


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Johannes Berg

> When calling drv_tx() the headroom is not big enough for the driver.

Ok.

> > 
> > Maybe we're adding something else to this skb?
> > 
> > I can't find anything in the ath9k_htc driver that's adding more than
> > 23 bytes (it's advertising 24) but clearly the last 8 bytes here are
> > failing:
> > 
> > > [   83.200346] skbuff: skb_under_panic: text:a034c028 len:154
> > > put:8 head:880213422e00 data:880213422dfa tail:0x94 end:0xc0
> > > dev:
> > 
> > Maybe mac80211 is putting something else? It'd have to be
> 
> yes mac80211 is adding the security header.
> headroom asked to skb_copy_expand should also take
> sdata->encrypt_headroom into account.

I suspected that, since it was about the only place that was adding
anything, but I couldn't test it :)

Can you send a patch?

johannes


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Johannes Berg

> I made a quick test with dongle using ath9k_htc driver and I indeed
> reproduce the issue.

Thanks.

> Here is the stack trace I get:
> I added a trace before calling skb_copy_expand to get the headroom of
> the buffer before the copy and the headroom asked by the driver.
> 
> [   83.200261] MESH fwd: skb_headroom=154, needed headroom=24

Could you also add a similar trace just before calling drv_tx()?

Maybe we're adding something else to this skb?

I can't find anything in the ath9k_htc driver that's adding more than
23 bytes (it's advertising 24) but clearly the last 8 bytes here are
failing:

> [   83.200346] skbuff: skb_under_panic: text:a034c028 len:154
> put:8 head:880213422e00 data:880213422dfa tail:0x94 end:0xc0
> dev:

Maybe mac80211 is putting something else? It'd have to be 

johannes


RE: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Cedric Izoard
> On 2017-01-11 20:01, Johannes Berg wrote:
> > Sure, ssh won't - I was thinking of netconsole:
> > https://www.kernel.org/doc/Documentation/networking/netconsole.txt
> 
> Oh, I see. Thanks, I will try.
> 
> Masashi Honma.

Hi,

I made a quick test with a dongle using the ath9k_htc driver and I can indeed
reproduce the issue.
Here is the stack trace I get. I added a trace before calling skb_copy_expand
to print the headroom of the buffer before the copy and the headroom requested
by the driver.

[   83.200261] MESH fwd: skb_headroom=154, needed headroom=24
[   83.200346] skbuff: skb_under_panic: text:a034c028 len:154 put:8 head:880213422e00 data:880213422dfa tail:0x94 end:0xc0 dev:
[   83.200359] [ cut here ]
[   83.200362] kernel BUG at ../net/core/skbuff.c:105!
[   83.200364] invalid opcode:  [#1] SMP
[   83.200366] Modules linked in: ath9k_htc ath9k_common ath9k_hw ath10k_pci ath10k_core mac80211 ath cfg80211 x86_pkg_temp_thermal
[   83.200377] CPU: 4 PID: 29 Comm: ksoftirqd/4 Not tainted 4.9.0+ #3
[   83.200379] Hardware name: Dell Inc. OptiPlex 990/06D7TR, BIOS A19 08/26/2015
[   83.200381] task: 880223e55cc0 task.stack: c9db8000
[   83.200383] RIP: 0010:[]  [] skb_panic+0x5c/0x60
[   83.200391] RSP: 0018:c9dbbba0  EFLAGS: 00010286
[   83.200393] RAX: 0086 RBX: 0006 RCX: 
[   83.200398] RDX: 8802253126d8 RSI: 88022530cb28 RDI: 88022530cb28
[   83.200400] RBP: c9dbbbc0 R08: 00030e9a R09: 0005
[   83.200401] R10:  R11: 02c8 R12: 880222d0a000
[   83.200403] R13: 9200 R14: 880221ba7f00 R15: 0010
[   83.200406] FS:  () GS:88022530() knlGS:
[   83.200408] CS:  0010 DS:  ES:  CR0: 80050033
[   83.200410] CR2: 7fd414010c98 CR3: 00022247d000 CR4: 000406e0
[   83.200411] Stack:
[   83.200413]  880213422dfa 0094 00c0 81c308d9
[   83.200417]  c9dbbbd0 817019b7 c9dbbc00 a034c028
[   83.200420]  880221ba7f00 8802226d15a0  c9dbbcb8
[   83.200423] Call Trace:
[   83.200429]  [] skb_push+0x37/0x40
[   83.200435]  [] htc_issue_send.constprop.2+0x28/0x60 [ath9k_htc]
[   83.200441]  [] htc_send+0x11/0x20 [ath9k_htc]
[   83.200445]  [] ath9k_htc_tx_start+0xd7/0x2a0 [ath9k_htc]
[   83.200450]  [] ath9k_htc_tx+0xa8/0xd0 [ath9k_htc]
[   83.200471]  [] ieee80211_tx_frags+0x137/0x1f0 [mac80211]
[   83.200489]  [] __ieee80211_tx+0x7c/0x180 [mac80211]
[   83.200506]  [] ieee80211_tx+0xe5/0x110 [mac80211]
[   83.200518]  [] ieee80211_tx_pending+0x8f/0x1f0 [mac80211]
[   83.200522]  [] ? pick_next_task_fair+0x406/0x470
[   83.200528]  [] tasklet_action+0xda/0xf0
[   83.200532]  [] __do_softirq+0xe2/0x270
[   83.200536]  [] run_ksoftirqd+0x17/0x30
[   83.200540]  [] smpboot_thread_fn+0x105/0x160
[   83.200543]  [] ? sort_range+0x20/0x20
[   83.200547]  [] kthread+0xc5/0xe0
[   83.200550]  [] ? kthread_park+0x60/0x60
[   83.200554]  [] ret_from_fork+0x22/0x30
[   83.200556] Code: c4 00 00 00 48 89 44 24 10 8b 87 c0 00 00 00 48 89 44 24 08 48 8b 87 d0 00 00 00 48 c7 c7 28 57 c3 81 48 89 04 24 e8 15 78 a2 ff <0f> 0b 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 65 4c 8b 34
[   83.200596] RIP  [] skb_panic+0x5c/0x60
[   83.200602]  RSP 
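
As a rough cross-check of the skb_under_panic line above, the sketch below
decodes its head, data and put fields. It assumes that the data value is
printed after skb_push() has already moved the pointer back by put bytes, and
it uses the addresses exactly as they appear in the log (only their difference
matters). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields copied from an skb_under_panic line (names are illustrative). */
struct panic_fields {
	uint64_t head;     /* head: start of the buffer */
	uint64_t data;     /* data: data pointer as printed, i.e. after the push */
	unsigned int put;  /* put: bytes skb_push() tried to prepend */
};

/* Bytes by which data ended up below head (0 if there was no underrun). */
static uint64_t underrun_bytes(const struct panic_fields *p)
{
	assert(p);
	return p->data < p->head ? p->head - p->data : 0;
}

/* Headroom that was still available just before the failing push. */
static int64_t headroom_before_push(const struct panic_fields *p)
{
	assert(p);
	return (int64_t)(p->data + p->put) - (int64_t)p->head;
}
```

For head:880213422e00, data:880213422dfa and put:8 this gives an underrun of 6
bytes and 2 bytes of headroom before the push, so the 8-byte push in
htc_issue_send was 6 bytes short.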

cedric


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Masashi Honma

On 2017-01-11 20:01, Johannes Berg wrote:
> Sure, ssh won't - I was thinking of netconsole:
> https://www.kernel.org/doc/Documentation/networking/netconsole.txt

Oh, I see. Thanks, I will try.

Masashi Honma.



Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Johannes Berg
On Wed, 2017-01-11 at 19:36 +0900, Masashi Honma wrote:
> On 2017-01-11 19:00, Johannes Berg wrote:
> > Nevertheless, I don't have hardware to try to reproduce it, and I can't
> > see any such issues (even with real forwarding, I even just wrote a
> > wpa_s test for that) in hwsim.
> > 
> > Even a photo of the crash on the VT would help. Or maybe you can set up
> > netconsole on the wired interface?
> 
> Thanks but SSH console via wired interface and laptop display does
> not show any log...

Sure, ssh won't - I was thinking of netconsole:
https://www.kernel.org/doc/Documentation/networking/netconsole.txt

johannes


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Masashi Honma

On 2017-01-11 19:00, Johannes Berg wrote:
> Nevertheless, I don't have hardware to try to reproduce it, and I can't
> see any such issues (even with real forwarding, I even just wrote a
> wpa_s test for that) in hwsim.
>
> Even a photo of the crash on the VT would help. Or maybe you can set up
> netconsole on the wired interface?

Thanks, but neither the SSH console via the wired interface nor the laptop
display shows any log...


Masashi Honma.


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Johannes Berg

> I will call the mesh peers "STA A" and "STA B".
> 
> Both STA has one physical wireless I/F and wired I/F.

Ok.

[snip configuration]

> Then STA A or STA B crashes, not both.

Nevertheless, I don't have hardware to try to reproduce it, and I can't
see any such issues (even with real forwarding, I even just wrote a
wpa_s test for that) in hwsim.

Even a photo of the crash on the VT would help. Or maybe you can set up
netconsole on the wired interface?

johannes


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Masashi Honma

On 2017-01-11 18:00, Johannes Berg wrote:
> Ok, that's strange, but maybe there's a reason.
>
> Can you extract *any* information whatsoever? Like maybe if you switch
> to a VT console before running into the crash? I don't have any
> hardware to run this on, and hwsim doesn't have any issues.


I will call the mesh peers "STA A" and "STA B".

Both STAs have one physical wireless I/F and one wired I/F.
I connected to both over SSH via the wired I/F and started
wpa_supplicant on both with this command:
	sudo ./hostap/wpa_supplicant/wpa_supplicant -i  -D nl80211 -c mesh_sae.conf


STA A's mesh_sae.conf is this.

--
ctrl_interface=/var/run/wpa_supplicant
ap_scan=1
user_mpm=1
update_config=0

network={
ssid="mesh0"
key_mgmt=SAE
mode=5
frequency=2412
psk="01234567"
}
--

STA B's mesh_sae.conf is this. The difference is "no_auto_peer=1".

--
ctrl_interface=/var/run/wpa_supplicant
ap_scan=1
user_mpm=1
update_config=0

network={
ssid="mesh0"
key_mgmt=SAE
mode=5
frequency=2412
psk="01234567"
no_auto_peer=1
}
--

wpa_supplicant starts up successfully.

After the successful peering process, I could see the MESH-PEER-CONNECTED
message on both sides.

Then STA A or STA B crashes, not both.

Masashi Honma.


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Johannes Berg
On Wed, 2017-01-11 at 17:50 +0900, Masashi Honma wrote:
> On 2017-01-11 17:02, Johannes Berg wrote:
> > I don't think this makes sense - if you only have two peers then you
> > shouldn't even run into forwarding code paths?
> > 
> > johannes
> 
> Though it looks odd, the code has run into forwarding code path even 
> though peer to peer mesh connection.
> 
>   fwd_skb = skb_copy(skb, GFP_ATOMIC);
> 
> I checked it with printk().
> 
> # I know printk() should not be used in the context, just for checking.

Ok, that's strange, but maybe there's a reason.

Can you extract *any* information whatsoever? Like maybe if you switch
to a VT console before running into the crash? I don't have any
hardware to run this on, and hwsim doesn't have any issues.

johannes


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Masashi Honma

On 2017-01-11 17:02, Johannes Berg wrote:
> I don't think this makes sense - if you only have two peers then you
> shouldn't even run into forwarding code paths?
>
> johannes

Though it looks odd, the code does run into the forwarding code path even
with a peer-to-peer mesh connection.

	fwd_skb = skb_copy(skb, GFP_ATOMIC);

I checked it with printk().

# I know printk() should not be used in this context; it was just for checking.
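
For readers comparing the two copy helpers, here is a simplified userspace
model, not kernel code, of how much headroom each one leaves. It assumes that
skb_copy() gives the copy the same headroom as the source, while
skb_copy_expand() gives it exactly the headroom requested. All names are
hypothetical; the numbers 154 and 24 in the usage note are taken from the
"MESH fwd" trace posted later in this thread.

```c
#include <assert.h>

/* Simplified model of an skb: only headroom and payload length. */
struct model_skb {
	unsigned int headroom; /* bytes available in front of the data */
	unsigned int len;      /* payload length */
};

/* Models skb_copy(): the copy keeps the source's headroom. */
static struct model_skb model_skb_copy(const struct model_skb *skb)
{
	struct model_skb n;

	assert(skb);
	n.headroom = skb->headroom;
	n.len = skb->len;
	return n;
}

/* Models skb_copy_expand(): the copy gets exactly the requested headroom. */
static struct model_skb model_skb_copy_expand(const struct model_skb *skb,
                                              unsigned int newheadroom)
{
	struct model_skb n;

	assert(skb);
	n.headroom = newheadroom;
	n.len = skb->len;
	return n;
}
```

In this model a source frame with 154 bytes of headroom keeps 154 after
model_skb_copy(), but only 24 after model_skb_copy_expand(skb, 24), so the
copy can end up with less headroom than the frame it was copied from.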

Masashi Honma.


Re: [REGRESSION, bisect] mesh: SAE connection causes kernel crash

2017-01-11 Thread Johannes Berg
On Wed, 2017-01-11 at 08:35 +0900, Masashi Honma wrote:
> I have encountered a kernel crash when using a mesh SAE connection
> with an ath9k_htc device (Sony UWA-BR100). I tried to connect 2 peers
> to each other, and then only one peer crashes.
> 
> By bisecting, this commit appears to cause the issue.
> 
> commit d8da0b5d64d58f7775a94bcf12dda50f13a76f22
> Author: Cedric Izoard 
> Date:   Wed Dec 7 09:59:00 2016 +
> 
>  mac80211: Ensure enough headroom when forwarding mesh pkt

I don't think this makes sense - if you only have two peers then you
shouldn't even run into forwarding code paths?

johannes