Re: 4.3+: Atheros ethernet fails after resume from s2ram, due to order 4 allocation

2015-11-28 Thread Pavel Machek
Hi!


> > /*
> >  * real ring DMA buffer
> >  * each ring/block may need up to 8 bytes for alignment, hence the
> >  * additional bytes tacked onto the end.
> >  */
> > ring_header->size = size =
> >         sizeof(struct atl1c_tpd_desc) * tpd_ring->count * 2 +
> >         sizeof(struct atl1c_rx_free_desc) * rx_desc_count +
> >         sizeof(struct atl1c_recv_ret_status) * rx_desc_count +
> >         8 * 4;
> >
> > ring_header->desc = pci_alloc_consistent(pdev, ring_header->size,
> >                         &ring_header->dma);
> 
> Why is pci_alloc_consistent doing an unconditional GFP_ATOMIC
> allocation? atl1c_setup_ring_resources already does GFP_KERNEL
> allocation in the same function so this should be sleepable
> context. I think we should either add pci_alloc_consistent_gfp if
> there are no explicit reasons to not do so or you can workaround

There's an existing interface, dma_alloc_coherent(), which can be used.

I have not tried __GFP_REPEAT yet, but GFP_KERNEL should already be
a big improvement.

Let me send a patch..
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 4.3+: Atheros ethernet fails after resume from s2ram, due to order 4 allocation

2015-11-27 Thread Michal Hocko
On Thu 26-11-15 17:34:13, Pavel Machek wrote:
> Hi!
> 
> ...and dmesg tells us what is going on:
> 
> [ 6961.550240] NetworkManager: page allocation failure: order:4,
> mode:0x2080020

This is a GFP_ATOMIC|___GFP_RECLAIMABLE high-order request, so it is
something whose failure the caller should tolerate.

> [ 6961.550249] CPU: 0 PID: 2590 Comm: NetworkManager Tainted: G
> W   4.3.0+ #124
> [ 6961.550250] Hardware name: Acer Aspire 5732Z/Aspire 5732Z, BIOS
> V3.07 02/10/2010
> [ 6961.550252]    f2ad1a04 c42ba5b8  f2ad1a2c
> c40d650a c4d3ee1c
> [ 6961.550260]  f34ef600 0004 02080020 c4eeef40  0010
>  f2ad1ac8
> [ 6961.550266]  c40d8caa 02080020 0004  0070 f34ef200
> 0060 0010
> [ 6961.550272] Call Trace:
> ...[ 6961.550299]  [] dma_generic_alloc_coherent+0x71/0x120
> [ 6961.550301]  [] ? via_no_dac+0x30/0x30
> [ 6961.550307]  [] atl1c_open+0x29e/0x300
> [ 6961.550313]  [] ? call_netdevice_notifiers_info+0x25/0x50
> [ 6961.550316]  [] __dev_open+0x7b/0xf0
> [ 6961.550318]  [] __dev_change_flags+0x89/0x140
> [ 6961.550320]  [] dev_change_flags+0x23/0x60
> [ 6961.550325]  [] do_setlink+0x286/0x7b0
> [ 6961.550328]  [] ? nla_parse+0x22/0xd0
> [ 6961.550330]  [] rtnl_newlink+0x5d6/0x860
> [ 6961.550336]  [] ? __lock_acquire.isra.24+0x3a1/0xc80
> [ 6961.550342]  [] ? ns_capable+0x22/0x60
> [ 6961.550345]  [] ? __netlink_ns_capable+0x2d/0x40
> [ 6961.550351]  [] ? xprt_transmit+0x94/0x220
> [ 6961.550354]  [] rtnetlink_rcv_msg+0x76/0x1f0
> [ 6961.550356]  [] ? rtnetlink_rcv+0x30/0x30
> [ 6961.550359]  [] netlink_rcv_skb+0x8e/0xb0
> ...
> [ 6961.550412] Mem-Info:
> [ 6961.550417] active_anon:30319 inactive_anon:25075 isolated_anon:0
>  active_file:327764 inactive_file:152179 isolated_file:16
>   unevictable:0 dirty:6 writeback:0 unstable:0
>    slab_reclaimable:149091 slab_unreclaimable:18973
> mapped:18100 shmem:4847 pagetables:1538 bounce:0
>  free:57732 free_pcp:10 free_cma:0
> ...
> [ 6961.550492] 485897 total pagecache pages
> [ 6961.550494] 1086 pages in swap cache
> [ 6961.550496] Swap cache stats: add 16738, delete 15652, find
> 6708/8500
> [ 6961.550497] Free swap  = 1656440kB
> [ 6961.550498] Total swap = 1681428kB
> [ 6961.550499] 785914 pages RAM
> [ 6961.550500] 557663 pages HighMem/MovableOnly
> [ 6961.550501] 12639 pages reserved
> [ 6961.550506] atl1c :05:00.0: pci_alloc_consistend failed
> [ 6962.148358] psmouse serio1: synaptics: queried max coordinates: x
> [..5772], y [..5086]
> 
> Order 4 allocation... probably doable during boot, but not really
> suitable during resume.
> 
> I'm not sure how repeatable it is, but it definitely happened more
> than once.
> 
> /*
>  * real ring DMA buffer
>  * each ring/block may need up to 8 bytes for alignment, hence the
>  * additional bytes tacked onto the end.
>  */
> ring_header->size = size =
>         sizeof(struct atl1c_tpd_desc) * tpd_ring->count * 2 +
>         sizeof(struct atl1c_rx_free_desc) * rx_desc_count +
>         sizeof(struct atl1c_recv_ret_status) * rx_desc_count +
>         8 * 4;
>
> ring_header->desc = pci_alloc_consistent(pdev, ring_header->size,
>                         &ring_header->dma);

Why is pci_alloc_consistent doing an unconditional GFP_ATOMIC
allocation? atl1c_setup_ring_resources already does a GFP_KERNEL
allocation in the same function, so this should be a sleepable
context. I think we should either add pci_alloc_consistent_gfp, if
there are no explicit reasons not to, or you can work around
that by open-coding it and using dma_alloc_coherent directly with
GFP_KERNEL|__GFP_REPEAT. This doesn't guarantee success, though,
because the order is > PAGE_ALLOC_COSTLY_ORDER, but it would increase
the chances considerably. A vmalloc fallback can also be used more
safely then.

> if (unlikely(!ring_header->desc)) {
>         dev_err(&pdev->dev, "pci_alloc_consistend failed\n");
>         goto err_nomem;
> }
> 
> (Note the typo in dev_err... at least it is easy to grep).
> 
> Ok, so what went on is easy.. any ideas how to fix it?

-- 
Michal Hocko
SUSE Labs


Re: 4.3+: Atheros ethernet fails after resume from s2ram, due to order 4 allocation

2015-11-26 Thread Francois Romieu
Pavel Machek  :
[...]
> Ok, so what went on is easy.. any ideas how to fix it ?

The driver should 1) prohibit holes in its receive ring, 2) allocate before
pushing data up the stack, 3) drop packets when it can't allocate a
fresh buffer, and 4) stop releasing receive buffers - and any resource for
that matter - during suspend.

Really.

-- 
Ueimor


4.3+: Atheros ethernet fails after resume from s2ram, due to order 4 allocation

2015-11-26 Thread Pavel Machek
Hi!

...and dmesg tells us what is going on:

[ 6961.550240] NetworkManager: page allocation failure: order:4,
mode:0x2080020
[ 6961.550249] CPU: 0 PID: 2590 Comm: NetworkManager Tainted: G
W   4.3.0+ #124
[ 6961.550250] Hardware name: Acer Aspire 5732Z/Aspire 5732Z, BIOS
V3.07 02/10/2010
[ 6961.550252]    f2ad1a04 c42ba5b8  f2ad1a2c
c40d650a c4d3ee1c
[ 6961.550260]  f34ef600 0004 02080020 c4eeef40  0010
 f2ad1ac8
[ 6961.550266]  c40d8caa 02080020 0004  0070 f34ef200
0060 0010
[ 6961.550272] Call Trace:
...[ 6961.550299]  [] dma_generic_alloc_coherent+0x71/0x120
[ 6961.550301]  [] ? via_no_dac+0x30/0x30
[ 6961.550307]  [] atl1c_open+0x29e/0x300
[ 6961.550313]  [] ? call_netdevice_notifiers_info+0x25/0x50
[ 6961.550316]  [] __dev_open+0x7b/0xf0
[ 6961.550318]  [] __dev_change_flags+0x89/0x140
[ 6961.550320]  [] dev_change_flags+0x23/0x60
[ 6961.550325]  [] do_setlink+0x286/0x7b0
[ 6961.550328]  [] ? nla_parse+0x22/0xd0
[ 6961.550330]  [] rtnl_newlink+0x5d6/0x860
[ 6961.550336]  [] ? __lock_acquire.isra.24+0x3a1/0xc80
[ 6961.550342]  [] ? ns_capable+0x22/0x60
[ 6961.550345]  [] ? __netlink_ns_capable+0x2d/0x40
[ 6961.550351]  [] ? xprt_transmit+0x94/0x220
[ 6961.550354]  [] rtnetlink_rcv_msg+0x76/0x1f0
[ 6961.550356]  [] ? rtnetlink_rcv+0x30/0x30
[ 6961.550359]  [] netlink_rcv_skb+0x8e/0xb0
...
[ 6961.550412] Mem-Info:
[ 6961.550417] active_anon:30319 inactive_anon:25075 isolated_anon:0
 active_file:327764 inactive_file:152179 isolated_file:16
  unevictable:0 dirty:6 writeback:0 unstable:0
   slab_reclaimable:149091 slab_unreclaimable:18973
mapped:18100 shmem:4847 pagetables:1538 bounce:0
 free:57732 free_pcp:10 free_cma:0
...
[ 6961.550492] 485897 total pagecache pages
[ 6961.550494] 1086 pages in swap cache
[ 6961.550496] Swap cache stats: add 16738, delete 15652, find
6708/8500
[ 6961.550497] Free swap  = 1656440kB
[ 6961.550498] Total swap = 1681428kB
[ 6961.550499] 785914 pages RAM
[ 6961.550500] 557663 pages HighMem/MovableOnly
[ 6961.550501] 12639 pages reserved
[ 6961.550506] atl1c :05:00.0: pci_alloc_consistend failed
[ 6962.148358] psmouse serio1: synaptics: queried max coordinates: x
[..5772], y [..5086]

Order 4 allocation... probably doable during boot, but not really
suitable during resume.

I'm not sure how repeatable it is, but it definitely happened more
than once.

/*
 * real ring DMA buffer
 * each ring/block may need up to 8 bytes for alignment, hence the
 * additional bytes tacked onto the end.
 */
ring_header->size = size =
        sizeof(struct atl1c_tpd_desc) * tpd_ring->count * 2 +
        sizeof(struct atl1c_rx_free_desc) * rx_desc_count +
        sizeof(struct atl1c_recv_ret_status) * rx_desc_count +
        8 * 4;

ring_header->desc = pci_alloc_consistent(pdev, ring_header->size,
                        &ring_header->dma);
if (unlikely(!ring_header->desc)) {
        dev_err(&pdev->dev, "pci_alloc_consistend failed\n");
        goto err_nomem;
}

(Note the typo in dev_err... at least it is easy to grep).

Ok, so what went on is easy.. any ideas how to fix it?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html