Re: [Intel-wired-lan] [PATCH v1 1/4] igb: Remove unnecessary include of

2018-07-25 Thread Alexander Duyck
On Wed, Jul 25, 2018 at 12:52 PM, Bjorn Helgaas  wrote:
> From: Bjorn Helgaas 
>
> The igb driver doesn't need anything provided by pci-aspm.h, so remove
> the unnecessary include of it.
>
> Signed-off-by: Bjorn Helgaas 

Looks good to me.

Acked-by: Alexander Duyck 

> ---
>  drivers/net/ethernet/intel/igb/igb_main.c |1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
> b/drivers/net/ethernet/intel/igb/igb_main.c
> index f707709969ac..c77fda05f683 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -22,7 +22,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
>
> ___
> Intel-wired-lan mailing list
> intel-wired-...@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Re: How to debug DMAR errors?

2017-04-14 Thread Alexander Duyck
On Fri, Apr 14, 2017 at 9:19 AM, Ben Greear <gree...@candelatech.com> wrote:
>
>
> On 04/14/2017 08:45 AM, Alexander Duyck wrote:
>>
>> On Thu, Apr 13, 2017 at 11:12 AM, Ben Greear <gree...@candelatech.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I have been seeing a regular occurrence of DMAR errors, looking something
>>> like this when testing my ath10k driver/firmware under some specific
>>> loads
>>> (maximum receive of 512 byte frames in AP mode):
>>>
>>> DMAR: DRHD: handling fault status reg 3
>>> DMAR: [DMA Read] Request device [05:00.0] fault addr fd99f000 [fault
>>> reason
>>> 06] PTE Read access is not set
>>> ath10k_pci :05:00.0: firmware crashed! (uuid
>>> 594b1393-ae35-42b5-9dec-74ff0c6791ff)
>>>
>>> So, I am wondering if there is any way I can get more information about
>>> what
>>> this fd99f000 address
>>> is?
>>>
>>> Once this problem hits, the entire OS locks hard (not even sysrq-boot
>>> will
>>> do anything),
>>> so I guess I would need the DMAR logic to print out more info on that
>>> address somehow.
>>>
>>> Thanks,
>>> Ben
>>
>>
>> There isn't much more info to give you. The problem is that the device
>> at 5:00.0 attempted to read at fd99f000 even though it didn't have
>> permissions. In response this should trigger a PCI Master Abort
>> message to that function. It looks like the firmware for the device
>> doesn't handle that and so that is likely why things got hung.
>>
>> Really you would need to interrogate the ath10k_pci to see if there
>> is/was a mapping somewhere for that address and what it was supposed
>> to be used for.
>
>
> I'm working on a hook in DMAR logic to call into ath10k_pci when the
> error is seen, so the ath10k can dump debug info, including recent DMA
> addresses.
>
> My code is an awful hack so far, but if someone could add a clean way to
> register
> DMAR error callbacks, I think that would be very welcome.  It might could
> tie into
> automated dma map/unmap debugging logic, and at the least, someone could
> write custom debugging callbacks
> for the driver(s) in question.
>
> Thanks,
> Ben
>

You might look at coding up something to add pci_error_handlers for
the pci_driver in the ath10k_pci driver. The PCI Master Abort should
trigger an error that you could then capture in the driver and handle
at least dumping it via your own implementation of the error handlers.
If nothing else I suspect there are probably some sort of descriptor
rings you could probably dump. I'm suspecting this is some sort of Tx
issue since the problem was a read fault, but I suppose there are
other paths in the driver that might trigger DMA read requests.

- Alex


Re: How to debug DMAR errors?

2017-04-14 Thread Alexander Duyck
On Thu, Apr 13, 2017 at 11:12 AM, Ben Greear  wrote:
> Hello,
>
> I have been seeing a regular occurrence of DMAR errors, looking something
> like this when testing my ath10k driver/firmware under some specific loads
> (maximum receive of 512 byte frames in AP mode):
>
> DMAR: DRHD: handling fault status reg 3
> DMAR: [DMA Read] Request device [05:00.0] fault addr fd99f000 [fault reason
> 06] PTE Read access is not set
> ath10k_pci :05:00.0: firmware crashed! (uuid
> 594b1393-ae35-42b5-9dec-74ff0c6791ff)
>
> So, I am wondering if there is any way I can get more information about what
> this fd99f000 address
> is?
>
> Once this problem hits, the entire OS locks hard (not even sysrq-boot will
> do anything),
> so I guess I would need the DMAR logic to print out more info on that
> address somehow.
>
> Thanks,
> Ben

There isn't much more info to give you. The problem is that the device
at 5:00.0 attempted to read at fd99f000 even though it didn't have
permissions. In response this should trigger a PCI Master Abort
message to that function. It looks like the firmware for the device
doesn't handle that and so that is likely why things got hung.

Really you would need to interrogate the ath10k_pci to see if there
is/was a mapping somewhere for that address and what it was supposed
to be used for.

- Alex


Re: [PATCH v2] rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 11:01 AM, Larry Finger <larry.fin...@lwfinger.net> wrote:
> On 05/06/2016 12:13 PM, Alexander Duyck wrote:
>>
>> On Fri, May 6, 2016 at 9:33 AM, Wang YanQing <udkni...@gmail.com> wrote:
>>>
>>> We can't use kfree_skb in irq disable context, because spin_lock_irqsave
>>> make sure we are always in irq disable context, use dev_kfree_skb_irq
>>> instead of kfree_skb is better than dev_kfree_skb_any.
>>>
>>> This patch fix below kernel warning:
>>> [ 7612.095528] [ cut here ]
>>> [ 7612.095546] WARNING: CPU: 3 PID: 4460 at kernel/softirq.c:150
>>> __local_bh_enable_ip+0x58/0x80()
>>> [ 7612.095550] Modules linked in: rtl8723be x86_pkg_temp_thermal
>>> btcoexist rtl_pci rtlwifi rtl8723_common
>>> [ 7612.095567] CPU: 3 PID: 4460 Comm: ifconfig Tainted: GW
>>> 4.4.0+ #4
>>> [ 7612.095570] Hardware name: LENOVO 20DFA04FCD/20DFA04FCD, BIOS J5ET48WW
>>> (1.19 ) 08/27/2015
>>> [ 7612.095574]    da37fc70 c12ce7c5  da37fca0
>>> c104cc59 c19d4454
>>> [ 7612.095584]  0003 116c c19d4784 0096 c10508a8 c10508a8
>>> 0200 c1b42400
>>> [ 7612.095594]  f29be780 da37fcb0 c104ccad 0009  da37fcbc
>>> c10508a8 f21f08b8
>>> [ 7612.095604] Call Trace:
>>> [ 7612.095614]  [] dump_stack+0x41/0x5c
>>> [ 7612.095620]  [] warn_slowpath_common+0x89/0xc0
>>> [ 7612.095628]  [] ? __local_bh_enable_ip+0x58/0x80
>>> [ 7612.095634]  [] ? __local_bh_enable_ip+0x58/0x80
>>> [ 7612.095640]  [] warn_slowpath_null+0x1d/0x20
>>> [ 7612.095646]  [] __local_bh_enable_ip+0x58/0x80
>>> [ 7612.095653]  [] destroy_conntrack+0x64/0xa0
>>> [ 7612.095660]  [] nf_conntrack_destroy+0xf/0x20
>>> [ 7612.095665]  [] skb_release_head_state+0x55/0xa0
>>> [ 7612.095670]  [] skb_release_all+0xb/0x20
>>> [ 7612.095674]  [] __kfree_skb+0xb/0x60
>>> [ 7612.095679]  [] kfree_skb+0x30/0x70
>>> [ 7612.095686]  [] ? rtl_pci_reset_trx_ring+0x22d/0x370
>>> [rtl_pci]
>>> [ 7612.095692]  [] rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
>>> [ 7612.095698]  [] rtl_pci_start+0x19/0x190 [rtl_pci]
>>> [ 7612.095705]  [] rtl_op_start+0x56/0x90 [rtlwifi]
>>> [ 7612.095712]  [] drv_start+0x36/0xc0
>>> [ 7612.095717]  [] ieee80211_do_open+0x2d3/0x890
>>> [ 7612.095725]  [] ? call_netdevice_notifiers_info+0x2e/0x60
>>> [ 7612.095730]  [] ieee80211_open+0x4d/0x50
>>> [ 7612.095736]  [] __dev_open+0xa3/0x130
>>> [ 7612.095742]  [] ? _raw_spin_unlock_bh+0x13/0x20
>>> [ 7612.095748]  [] __dev_change_flags+0x89/0x140
>>> [ 7612.095753]  [] ? selinux_capable+0xd/0x10
>>> [ 7612.095759]  [] dev_change_flags+0x29/0x60
>>> [ 7612.095765]  [] devinet_ioctl+0x553/0x670
>>> [ 7612.095772]  [] ? _copy_to_user+0x28/0x40
>>> [ 7612.095777]  [] inet_ioctl+0x85/0xb0
>>> [ 7612.095783]  [] sock_ioctl+0x67/0x260
>>> [ 7612.095788]  [] ? sock_fasync+0x80/0x80
>>> [ 7612.095795]  [] do_vfs_ioctl+0x6b/0x550
>>> [ 7612.095800]  [] ? selinux_file_ioctl+0x102/0x1e0
>>> [ 7612.095807]  [] ? timekeeping_suspend+0x294/0x320
>>> [ 7612.095813]  [] ? __hrtimer_run_queues+0x14a/0x210
>>> [ 7612.095820]  [] ? security_file_ioctl+0x34/0x50
>>> [ 7612.095827]  [] SyS_ioctl+0x70/0x80
>>> [ 7612.095832]  [] do_fast_syscall_32+0x84/0x120
>>> [ 7612.095839]  [] sysenter_past_esp+0x36/0x55
>>> [ 7612.095844] ---[ end trace 97e9c637a20e8348 ]---
>>>
>>> Signed-off-by: Wang YanQing <udkni...@gmail.com>
>>> Cc: Stable <sta...@vger.kernel.org>
>>> ---
>>>   Changes:
>>>   v1-v2:
>>>   1: add a Cc to stable.
>>>
>>>   drivers/net/wireless/realtek/rtlwifi/pci.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c
>>> b/drivers/net/wireless/realtek/rtlwifi/pci.c
>>> index 1ac41b8..99a3a03 100644
>>> --- a/drivers/net/wireless/realtek/rtlwifi/pci.c
>>> +++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
>>> @@ -1572,7 +1572,7 @@ int rtl_pci_reset_trx_ring(struct ieee80211_hw *hw)
>>>   true,
>>>
>>> HW_DESC_TXBUFF_ADDR),
>>>   skb->len,
>>> PCI_DMA_TODEVICE);
>>> -   kfree_skb(skb);
>>> +   dev_kfree_sk

Re: [PATCH v2] rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 9:33 AM, Wang YanQing  wrote:
> We can't use kfree_skb in irq disable context, because spin_lock_irqsave
> make sure we are always in irq disable context, use dev_kfree_skb_irq
> instead of kfree_skb is better than dev_kfree_skb_any.
>
> This patch fix below kernel warning:
> [ 7612.095528] [ cut here ]
> [ 7612.095546] WARNING: CPU: 3 PID: 4460 at kernel/softirq.c:150 
> __local_bh_enable_ip+0x58/0x80()
> [ 7612.095550] Modules linked in: rtl8723be x86_pkg_temp_thermal btcoexist 
> rtl_pci rtlwifi rtl8723_common
> [ 7612.095567] CPU: 3 PID: 4460 Comm: ifconfig Tainted: GW   
> 4.4.0+ #4
> [ 7612.095570] Hardware name: LENOVO 20DFA04FCD/20DFA04FCD, BIOS J5ET48WW 
> (1.19 ) 08/27/2015
> [ 7612.095574]    da37fc70 c12ce7c5  da37fca0 
> c104cc59 c19d4454
> [ 7612.095584]  0003 116c c19d4784 0096 c10508a8 c10508a8 
> 0200 c1b42400
> [ 7612.095594]  f29be780 da37fcb0 c104ccad 0009  da37fcbc 
> c10508a8 f21f08b8
> [ 7612.095604] Call Trace:
> [ 7612.095614]  [] dump_stack+0x41/0x5c
> [ 7612.095620]  [] warn_slowpath_common+0x89/0xc0
> [ 7612.095628]  [] ? __local_bh_enable_ip+0x58/0x80
> [ 7612.095634]  [] ? __local_bh_enable_ip+0x58/0x80
> [ 7612.095640]  [] warn_slowpath_null+0x1d/0x20
> [ 7612.095646]  [] __local_bh_enable_ip+0x58/0x80
> [ 7612.095653]  [] destroy_conntrack+0x64/0xa0
> [ 7612.095660]  [] nf_conntrack_destroy+0xf/0x20
> [ 7612.095665]  [] skb_release_head_state+0x55/0xa0
> [ 7612.095670]  [] skb_release_all+0xb/0x20
> [ 7612.095674]  [] __kfree_skb+0xb/0x60
> [ 7612.095679]  [] kfree_skb+0x30/0x70
> [ 7612.095686]  [] ? rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
> [ 7612.095692]  [] rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
> [ 7612.095698]  [] rtl_pci_start+0x19/0x190 [rtl_pci]
> [ 7612.095705]  [] rtl_op_start+0x56/0x90 [rtlwifi]
> [ 7612.095712]  [] drv_start+0x36/0xc0
> [ 7612.095717]  [] ieee80211_do_open+0x2d3/0x890
> [ 7612.095725]  [] ? call_netdevice_notifiers_info+0x2e/0x60
> [ 7612.095730]  [] ieee80211_open+0x4d/0x50
> [ 7612.095736]  [] __dev_open+0xa3/0x130
> [ 7612.095742]  [] ? _raw_spin_unlock_bh+0x13/0x20
> [ 7612.095748]  [] __dev_change_flags+0x89/0x140
> [ 7612.095753]  [] ? selinux_capable+0xd/0x10
> [ 7612.095759]  [] dev_change_flags+0x29/0x60
> [ 7612.095765]  [] devinet_ioctl+0x553/0x670
> [ 7612.095772]  [] ? _copy_to_user+0x28/0x40
> [ 7612.095777]  [] inet_ioctl+0x85/0xb0
> [ 7612.095783]  [] sock_ioctl+0x67/0x260
> [ 7612.095788]  [] ? sock_fasync+0x80/0x80
> [ 7612.095795]  [] do_vfs_ioctl+0x6b/0x550
> [ 7612.095800]  [] ? selinux_file_ioctl+0x102/0x1e0
> [ 7612.095807]  [] ? timekeeping_suspend+0x294/0x320
> [ 7612.095813]  [] ? __hrtimer_run_queues+0x14a/0x210
> [ 7612.095820]  [] ? security_file_ioctl+0x34/0x50
> [ 7612.095827]  [] SyS_ioctl+0x70/0x80
> [ 7612.095832]  [] do_fast_syscall_32+0x84/0x120
> [ 7612.095839]  [] sysenter_past_esp+0x36/0x55
> [ 7612.095844] ---[ end trace 97e9c637a20e8348 ]---
>
> Signed-off-by: Wang YanQing 
> Cc: Stable 
> ---
>  Changes:
>  v1-v2:
>  1: add a Cc to stable.
>
>  drivers/net/wireless/realtek/rtlwifi/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c 
> b/drivers/net/wireless/realtek/rtlwifi/pci.c
> index 1ac41b8..99a3a03 100644
> --- a/drivers/net/wireless/realtek/rtlwifi/pci.c
> +++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
> @@ -1572,7 +1572,7 @@ int rtl_pci_reset_trx_ring(struct ieee80211_hw *hw)
>  true,
>  HW_DESC_TXBUFF_ADDR),
>  skb->len, PCI_DMA_TODEVICE);
> -   kfree_skb(skb);
> +   dev_kfree_skb_irq(skb);
> ring->idx = (ring->idx + 1) % ring->entries;
> }
> ring->idx = 0;

Is this always called in IRQ context?  You might be better off using
dev_kfree_skb_any instead if this is something that can be called from
net_device_ops since that way you avoid having to call into the Tx
softirq cleanup routine to free the buffers later unless you really
need it.

- Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next 2/2] mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path

2014-09-11 Thread Alexander Duyck
On 09/11/2014 02:38 AM, Arend van Spriel wrote:
 On 09/11/14 09:06, Johannes Berg wrote:
 On Wed, 2014-09-10 at 18:05 -0400, Alexander Duyck wrote:
 There is a possible issue with the use, or lack thereof of sk_refcnt and
 sk_wmem_alloc in the wifi ack status functionality.

 Specifically if a socket were to request acknowledgements, and the
 socket
 were to have sk_refcnt drop to 0 resulting in it waiting on
 sk_wmem_alloc
 to reach 0 it would be possible to have sock_queue_err_skb orphan the
 last
 buffer, resulting in __sk_free being called on the socket.  After
 this the
 buffer is enqueued on sk_error_queue, however the queue has already been
 flushed resulting in at least a memory leak, if not a data corruption.

 Oh. Thanks :-)
 
 Hi Alexander,
 
 So why is this only an issue in wifi ack path. The sock_queue_err_skb()
 does not mention the caller should hold a sock reference. This seems
 entirely an issue of the sock_queue_err_skb() function itself so why not
 do sk_hold/sk_put within that function. Does it impose too much overhead?
 
 Regards,
 Arend

I considered it but there are a number of cases where this is not an issue.

For example in the tx timestamping path there is the software timestamp
case where the buffer is cloned and the clone is queued immediately onto
the sk_error_queue.  In that case we still have a reference in the other
skb that is maintaining the socket.

So I thought it best to just address the cases where I know this could
be a problem.  I had already addressed it in the timestamping for
hardware timestamps where we are doing something similar.  So I thought
it would make sense to cover the other case that should have the same
problems.

Thanks,

Alex
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next 2/2] mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path

2014-09-11 Thread Alexander Duyck
On 09/11/2014 12:06 AM, Johannes Berg wrote:
 On Wed, 2014-09-10 at 18:05 -0400, Alexander Duyck wrote:
 There is a possible issue with the use, or lack thereof of sk_refcnt and
 sk_wmem_alloc in the wifi ack status functionality.

 Specifically if a socket were to request acknowledgements, and the socket
 were to have sk_refcnt drop to 0 resulting in it waiting on sk_wmem_alloc
 to reach 0 it would be possible to have sock_queue_err_skb orphan the last
 buffer, resulting in __sk_free being called on the socket.  After this the
 buffer is enqueued on sk_error_queue, however the queue has already been
 flushed resulting in at least a memory leak, if not a data corruption.
 
 Oh. Thanks :-)
 
 +/* take a reference to prevent skb_orphan() from freeing the socket */
 +sock_hold(sk);
 +
  err = sock_queue_err_skb(sk, skb);
  if (err)
  kfree_skb(skb);
 +
 +sock_put(sk);
  }
  EXPORT_SYMBOL_GPL(skb_complete_wifi_ack);
 
 Here I'm not sure it matters *for this function*? Wouldn't it be freed
 then in sock_put(), which has the same net effect on this function
 overall? It doesn't use it after sock_queue_err_skb().

The significant piece is that we are calling sock_put *after*.  So if we
are dropping the last reference the buffer is already in the
sk_error_queue and will be purged when __sk_free is called.

 Seems like maybe this should be in sock_queue_err_skb() itself, since it
 does the orphaning first and then looks at the socket. Or the
 documentation for that function should state that it has to be held, but
 there are plenty of callers?

The problem is there are a number of cases where the sock_hold/put are
not needed.  For example, if we were to clone the skb and immediately
send the clone up the sk_error_queue then we don't need it.  We only
need it if there is a risk that orphaning the buffer sent could
potentially result in the destructor calling __sk_free.

  spin_lock_irqsave(local-ack_status_lock, flags);
 -id = idr_alloc(local-ack_status_frames, orig_skb,
 +id = idr_alloc(local-ack_status_frames, ack_skb,
 1, 0x1, GFP_ATOMIC);
  spin_unlock_irqrestore(local-ack_status_lock, flags);
  
  if (id = 0) {
  info_id = id;
  info_flags |= IEEE80211_TX_CTL_REQ_TX_STATUS;
 -} else if (skb_shared(skb)) {
 -kfree_skb(orig_skb);
  } else {
 -kfree_skb(skb);
 -skb = orig_skb;
 +kfree_skb(ack_skb);
  }
 
 So you're removing this part, but can't we really not reuse the clone_sk
 copy? The difference is that it's charged, but that's fine for the
 purposes here, no? Or am I misunderstanding that?
 
 johannes

The copy being held cannot really be used for transmit.  The problem is
that it is holding the wrong kind of reference.

The problem lies in the order things are released.  The sock_put
function will dec_and_test sk_refcnt, once it reaches 0 it will do a
dec_and_test on sk_wmem_alloc to see if it should call __sk_free.  Until
that reaches 0 sk_wmem_alloc cannot reach 0.  Once either of these drops
to 0 we cannot bring the value back up from there.  So if I were to
transmit the clone then it could let the sk_refcnt drop to 0 in which
case any calls to sock_hold are invalid.

I would need to somehow hold the reference based on sk_wmem_alloc if we
want to transmit the clone.  Many of the hardware timestamping drivers
seem to just clone the original skb, queue that clone onto the
sk_error_queue, and then free the original after completing the call.  I
suppose we could change it to something like that, but you are still
looking at possibly 2 clones in that case anyway.

Thanks,

Alex
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next] mac80211: Check correct skb for shared states before freeing original

2014-09-10 Thread Alexander Duyck
The code for cloning the skb for an acknowledgement was checking to see if
the cloned skb was shared and if it was it was then freeing the original
skb.  Since a clone should never really be shared I suspect that the
intention was to avoid freeing the clone if the original was shared.  As
such I am updating the code so that if the original is shared we free the
original and use the clone.  This avoids unnecessary work in the next
section where we would be cloning the skb if the original is shared.

Signed-off-by: Alexander Duyck alexander.h.du...@intel.com
---
 net/mac80211/tx.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 925c39f..e527cd3 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -2087,7 +2087,7 @@ netdev_tx_t ieee80211_subif_start_xmit(struct sk_buff 
*skb,
if (id = 0) {
info_id = id;
info_flags |= IEEE80211_TX_CTL_REQ_TX_STATUS;
-   } else if (skb_shared(skb)) {
+   } else if (skb_shared(orig_skb)) {
kfree_skb(orig_skb);
} else {
kfree_skb(skb);

--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next] mac80211: Check correct skb for shared states before freeing original

2014-09-10 Thread Alexander Duyck
On 09/10/2014 01:33 PM, Johannes Berg wrote:
 On Wed, 2014-09-10 at 16:06 -0400, Alexander Duyck wrote:
 The code for cloning the skb for an acknowledgement was checking to see if
 the cloned skb was shared and if it was it was then freeing the original
 skb.  Since a clone should never really be shared I suspect that the
 intention was to avoid freeing the clone if the original was shared.  As
 such I am updating the code so that if the original is shared we free the
 original and use the clone.  This avoids unnecessary work in the next
 section where we would be cloning the skb if the original is shared.
 
 Thanks, yeah, I admit that this is clearly fishy.
 
 @@ -2087,7 +2087,7 @@ netdev_tx_t ieee80211_subif_start_xmit(struct sk_buff 
 *skb,
  if (id = 0) {
  info_id = id;
  info_flags |= IEEE80211_TX_CTL_REQ_TX_STATUS;
 
 Luckily, we practically always go into this path.
 
 -} else if (skb_shared(skb)) {
 +} else if (skb_shared(orig_skb)) {
  kfree_skb(orig_skb);
  } else {
  kfree_skb(skb);
 
 We have a clone already so we could just remove the whole else if I
 think, but I'm guessing my intent was to keep it accounted to the socket
 where possible rather than freeing the original in all cases.
 
 So yeah, I think this makes sense. Maybe we should add a comment to the
 if though to explain this?
 
 johannes

Actually I think we may need to take a different approach.  The reason I
was in this code was to take a look at a possible refcount issue.

I'll be submitting another patch in a few minutes and will probably be
dropping some of this code anyway.

Thanks,

Alex
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next 0/2] Address reference counting issues with sock_queue_err_skb

2014-09-10 Thread Alexander Duyck
After looking over the code for skb_clone_sk after some comments made by
Eric Dumazet I have come to the conclusion that skb_clone_sk is taking the
correct approach in how to handle the sk_refcnt when creating a buffer that
is eventually meant to be returned to the socket via the sock_queue_err_skb
function.

However upon review of other callers I found what I believe to be a
possible reference count issue in the path for handling wifi ack packets.
To address this I have applied the same logic that is currently in place so
that the sk_refcnt will be forced to stay at least 1, or we will not
provide an skb to return in the sk_error_queue.

---

Alexander Duyck (2):
  skb: Add documentation for skb_clone_sk
  mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path


 net/core/skbuff.c |   18 ++
 net/mac80211/tx.c |   15 ---
 2 files changed, 22 insertions(+), 11 deletions(-)

-- 
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next v2 1/2] skb: Add documentation for skb_clone_sk

2014-09-10 Thread Alexander Duyck
This change adds some documentation to the call skb_clone_sk.  This is
meant to help clarify the purpose of the function for other developers.

Signed-off-by: Alexander Duyck alexander.h.du...@intel.com
---

v2: Updated comments to specifically call out need for sock_hold/sock_put

 net/core/skbuff.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a18dfb0..c9da77a 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3511,6 +3511,19 @@ struct sk_buff *sock_dequeue_err_skb(struct sock *sk)
 }
 EXPORT_SYMBOL(sock_dequeue_err_skb);
 
+/**
+ * skb_clone_sk - create clone of skb, and take reference to socket
+ * @skb: the skb to clone
+ *
+ * This function creates a clone of a buffer that holds a reference on
+ * sk_refcnt.  Buffers created via this function are meant to be
+ * returned using sock_queue_err_skb, or free via kfree_skb.
+ *
+ * When passing buffers allocated with this function to sock_queue_err_skb
+ * it is necessary to wrap the call with sock_hold/sock_put in order to
+ * prevent the socket from being released prior to being enqueued on
+ * the sk_error_queue.
+ */
 struct sk_buff *skb_clone_sk(struct sk_buff *skb)
 {
struct sock *sk = skb-sk;

--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html