Re: [igb] netconsole triggers warning in netpoll_poll_dev

2021-04-06 Thread Oleksandr Natalenko
Hello. On Tue, Apr 06, 2021 at 11:48:02AM -0700, Jakub Kicinski wrote: > On Tue, 6 Apr 2021 14:36:19 +0200 Oleksandr Natalenko wrote: > > Hello. > > > > I've raised this here [1] first, but was suggested to engage igb devs, > > so here we are.

[igb] netconsole triggers warning in netpoll_poll_dev

2021-04-06 Thread Oleksandr Natalenko
[2] https://bugzilla.kernel.org/show_bug.cgi?id=211911 -- Oleksandr Natalenko (post-factum)

Re: mt7612 suspend/resume issue

2020-06-22 Thread Oleksandr Natalenko
.c
@@ -119,9 +119,8 @@ mt76x2e_suspend(struct pci_dev *pdev, pm_message_t state)
 	mt76x02_dma_reset(dev);
-	pci_enable_wake(pdev, pci_choose_state(pdev, state), true);
 	pci_save_state(pdev);
-	err = pci_set_power_state(pdev, pci_choose_state(pdev, state));
+	err = pci_set_power_state(pdev, PCI_D0);
 	if (err)
 		goto restore;
?
-- Best regards, Oleksandr Natalenko (post-factum) Principal Software Maintenance Engineer

Re: mt7612 suspend/resume issue

2020-06-22 Thread Oleksandr Natalenko
struct mt76x02_dev *dev = container_of(mdev, struct mt76x02_dev, > > > mt76); > > > + int i, err; > > can you please double-check what is the PCI state requested during suspend? Do you mean ACPI S3 (this is the state the system enters)? If not, what should I check and wher

Re: mt7612 suspend/resume issue

2020-06-19 Thread Oleksandr Natalenko
ieee80211_restart_work+0xb7/0xe0 [mac80211] čen 18 23:12:02 spock kernel: process_one_work+0x1d4/0x3c0 čen 18 23:12:02 spock kernel: worker_thread+0x228/0x470 čen 18 23:12:02 spock kernel: ? process_one_work+0x3c0/0x3c0 čen 18 23:12:02 spock kernel: kthread+0x19c/0x1c0 čen 18 23:12:02 spock kernel: ? __kthread_init_worker+0x30/0x30 čen 18 23:12:02 spock kernel: ret_from_fork+0x35/0x40 čen 18 23:12:02 spock kernel: ---[ end trace e017bc3573bd9bf3 ]--- === Do you still want me to try Felix's tree, or is there something else I can try? Thank you. -- Best regards, Oleksandr Natalenko (post-factum) Principal Software Maintenance Engineer

mt7612 suspend/resume issue

2020-06-18 Thread Oleksandr Natalenko
with v5.7 kernel series only. Do you have any idea what could go wrong and how to approach the issue? Thanks. -- Best regards, Oleksandr Natalenko (post-factum) Principal Software Maintenance Engineer

Re: WARN_ON() in netconsole with PREEMPT_RT

2018-11-24 Thread Oleksandr Natalenko
Hi. On 12.11.2018 03:01, Steven Rostedt wrote: On Sun, 11 Nov 2018 21:16:00 +0100 Oleksandr Natalenko wrote: Oh, I see that write_msg() calls netpoll_send_udp() under spin_lock_irqsave(), but in PREEMPT_RT this, AFAIK, does not disable interrupts. So, the real question here is whether the

Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
Hi. On Wednesday, 21 February 2018 0:21:37 CET Eric Dumazet wrote: > My latest patch (fixing BBR underestimation of cwnd) > was meant for net tree, on a NIC where SG/TSO/GSO) are disabled. > > ( ie when sk->sk_gso_max_segs is not set to 'infinite' ) > > It is packet scheduler independent really. > >

Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
On Tuesday, 20 February 2018 21:09:37 CET Eric Dumazet wrote: > Also you can tune your NIC to accept few MSS per GSO/TSO packet > > ip link set dev eth0 gso_max_segs 2 > > So even if TSO/GSO is there, BBR should not use sk->sk_gso_max_segs to > size its bursts, since burst sizes are also impacting GRO

Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
On Tuesday, 20 February 2018 20:56:24 CET Eric Dumazet wrote: > That is with the other patches _not_ applied ? Yes, other patches are not applied. It is v4.15.4 + this patch only + BBR + fq_codel or pfifo_fast. Shall I re-test it on the net-next with the whole patchset (because it is not applied clea

Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
On Tuesday, 20 February 2018 20:39:49 CET Eric Dumazet wrote: > I am not trying to compare BBR and Reno on a lossless link. > > Reno is running as fast as possible and will win when bufferbloat is > not an issue. > > If bufferbloat is not an issue, simply use Reno and be happy ;) > > My patch helps B

Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
Hi. On Tuesday, 20 February 2018 19:57:42 CET Eric Dumazet wrote: > Actually timer drifts are not horrible (at least on my lab hosts) > > But BBR has a pessimistic way to sense the burst size, as it is tied to > TSO/GSO being there. > > Following patch helps a lot. Not really, at least if applied to

Re: [PATCH net-next 0/6] tcp: remove non GSO code

2018-02-20 Thread Oleksandr Natalenko
s less packets, SACK is cheaper. 8) Removal of legacy code. Less maintenance hassles. Note that I have left the sendpage/zerocopy paths, but they probably can benefit from the same strategy. Thanks to Oleksandr Natalenko for reporting a performance issue for BBR/fq_codel, which was the main re

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-18 Thread Oleksandr Natalenko
Hi. On Sunday, 18 February 2018 22:04:27 CET Eric Dumazet wrote: > I was able to take a look today, and I believe this is the time to > switch TCP to GSO being always on. > > As a bonus, we get speed boost for cubic as well. > > Today's high BDP and recent TCP improvements (rtx queue as rb-tree, sac

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-17 Thread Oleksandr Natalenko
Hi. On Friday, 16 February 2018 23:59:52 CET Eric Dumazet wrote: > Well, no effect here on e1000e (1 Gbit) at least > > # ethtool -K eth3 sg off > Actual changes: > scatter-gather: off > tx-scatter-gather: off > tcp-segmentation-offload: off > tx-tcp-segmentation: off [requested on] > tx-tcp6-segmen

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
On Friday, 16 February 2018 23:50:35 CET Eric Dumazet wrote: > /* snip */ > If you use > > tcptrace -R test_s2c.pcap > xplot.org d2c_rtt.xpl > > Then you'll see plenty of suspect 40ms rtt samples. That's odd. Even the way they look so uniform. > It looks like receiver misses wakeups for some rea

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
Hi. On Friday, 16 February 2018 21:54:05 CET Eric Dumazet wrote: > /* snip */ > Something fishy really : > /* snip */ > Not only the receiver suddenly adds a 25 ms delay, but also note that > it acknowledges all prior segments (ack 112949), but with a wrong ecr > value ( 2327043753 ) > instead of 2327

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
Hi. On Friday, 16 February 2018 18:56:12 CET Holger Hoffstätte wrote: > There is simply no reason why you shouldn't get approx. line rate > (~920+-ish) Mbit over wired 1GBit Ethernet; even my broken 10-year old > Core2Duo laptop can do that. Can you boot with spectre_v2=off and try "the > simplest cas

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
Hi. On Friday, 16 February 2018 17:25:58 CET Eric Dumazet wrote: > The way TCP pacing works, it defaults to internal pacing using a hint > stored in the socket. > > If you change the qdisc while flow is alive, result could be unexpected. I don't change a qdisc while flow is alive. Either the VM is c

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
Hi. On Friday, 16 February 2018 17:26:11 CET Holger Hoffstätte wrote: > These are very odd configurations. :) > Non-preempt/100 might well be too slow, whereas PREEMPT/1000 might simply > have too much overhead. Since the pacing is based on hrtimers, should HZ matter at all? Even if so, poor 1 Gbps

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
Hi. On Friday, 16 February 2018 17:33:48 CET Neal Cardwell wrote: > Thanks for the detailed report! Yes, this sounds like an issue in BBR. We > have not run into this one in our team, but we will try to work with you to > fix this. > > Would you be able to take a sender-side tcpdump trace of the slow

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
Hi! On Friday, 16 February 2018 17:45:56 CET Neal Cardwell wrote: > Eric raises a good question: bare metal vs VMs. > > Oleksandr, your first email mentioned KVM VMs and virtio NICs. Your > second e-mail did not seem to mention if those results were for bare > metal or a VM scenario: can you please c

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-16 Thread Oleksandr Natalenko
Hi, David, Eric, Neal et al. On Thursday, 15 February 2018 21:42:26 CET Oleksandr Natalenko wrote: > I've faced an issue with a limited TCP bandwidth between my laptop and a > server in my 1 Gbps LAN while using BBR as a congestion control mechanism. > To verify my observations, I'

TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-15 Thread Oleksandr Natalenko
Hello. I've faced an issue with a limited TCP bandwidth between my laptop and a server in my 1 Gbps LAN while using BBR as a congestion control mechanism. To verify my observations, I've set up 2 KVM VMs with the following parameters: 1) Linux v4.15.3 2) virtio NICs 3) 128 MiB of RAM 4) 2 vCPUs

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-11-10 Thread Oleksandr Natalenko
Uhh, sorry, just found the original submission [1]. [1] https://marc.info/?l=linux-netdev&m=151009763926816&w=2 10.11.2017 14:15, Oleksandr Natalenko wrote: Hi. I'm running the machine with this patch applied for 7 hours now, and the warning hasn't appeared yet. Typically,

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-11-10 Thread Oleksandr Natalenko
Hi. I'm running the machine with this patch applied for 7 hours now, and the warning hasn't appeared yet. Typically, it should be there within the first hour. I'll keep an eye on it for a longer time, but as of now it looks good. Some explanation on this please? Thanks! 06.11.2017 23:27, Y

Re: [PATCH net] tcp: fix tcp_mtu_probe() vs highest_sack

2017-11-03 Thread Oleksandr Natalenko
ighest_sack regardless of whatever > condition, since keeping a stale pointer to freed skb is a recipe > for disaster. > > Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct > access") Signed-off-by: Eric Dumazet > Reported-by: Alexei Starovoit

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-28 Thread Oleksandr Natalenko
Hi. I can't say anything about the panic in tcp_sacktag_walk() since I cannot trigger it intentionally, but setting net.ipv4.tcp_retrans_collapse to 0 *does not* fix the warning in tcp_fastretrans_alert() for me. On Wednesday, 27 September 2017 2:18:32 CEST Yuchung Cheng wrote: > On Tue, Sep 26, 2017 at 5:12 PM, Yuchung

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-19 Thread Oleksandr Natalenko
Mon, Sep 18, 2017 at 1:46 PM, Oleksandr Natalenko > > wrote: > > Actually, same warning was just triggered with RACK enabled. But main > > warning was not triggered in this case. > > Thanks. > > I assume this kernel does not have the patch that Neal proposed in

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-19 Thread Oleksandr Natalenko
Hi. 18.09.2017 23:40, Yuchung Cheng wrote: I assume this kernel does not have the patch that Neal proposed in his first reply? Correct. The main warning needs to be triggered by another peculiar SACK that kicks the sender into recovery again (after undo). Please let it run longer if possible

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-18 Thread Oleksandr Natalenko
fb eb b3 <0f> ff 5b 5d c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f Sep 18 22:44:32 defiant kernel: ---[ end trace 1aea180efeedb474 ]--- === On Monday, 18 September 2017 20:01:42 CEST Yuchung Cheng wrote: > On Mon, Sep 18, 2017 at 10:59 AM, Oleksandr Natalenko > > wrote:

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-18 Thread Oleksandr Natalenko
18, 2017 at 10:59 AM, Oleksandr Natalenko > > wrote: > > OK. Should I keep FACK disabled? > > Yes since it is disabled in the upstream by default. Although you can > experiment FACK enabled additionally. > > Do we know the crash you first experienced is tied to this is

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-18 Thread Oleksandr Natalenko
On Monday, 18 September 2017 20:01:42 CEST Yuchung Cheng wrote: > Yes since it is disabled in the upstream by default. Although you can > experiment FACK enabled additionally. OK. > Do we know the crash you first experienced is tied to this issue? No, unfortunately. I wasn't able to re-create it aga

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-18 Thread Oleksandr Natalenko
OK. Should I keep FACK disabled? On Monday, 18 September 2017 19:51:21 CEST Yuchung Cheng wrote: > Can you try this patch to verify my theory with tcp_recovery=0 and 1? thanks > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 5af2f04f8859..9253d9ee7d0e 100644 > --- a/net/ipv4/tcp_i

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-17 Thread Oleksandr Natalenko
Hi. Just to note that it looks like disabling RACK and re-enabling FACK prevents the warning from happening: net.ipv4.tcp_fack = 1 net.ipv4.tcp_recovery = 0 I hope I got the semantics of these tunables right. On Friday, 15 September 2017 21:04:36 CEST Oleksandr Natalenko wrote: > Hello. > > With
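
For readers who want to check the two tunables named in the message above, here is a minimal sketch. It maps a dotted sysctl name to its file under /proc/sys and parses `key = value` lines like the ones quoted. The helper names (`sysctl_path`, `parse_sysctl_lines`, `read_sysctl`) are hypothetical and not part of any tool mentioned in the thread, and whether `net.ipv4.tcp_fack` exists or has any effect depends on the kernel version.

```python
from pathlib import Path
from typing import Dict, Optional


def sysctl_path(name: str) -> Path:
    """Map a dotted sysctl name to its file under /proc/sys.

    Simplification: names whose components themselves contain dots
    (for example some interface names) are not handled.
    """
    return Path("/proc/sys") / name.replace(".", "/")


def parse_sysctl_lines(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, the format used in the message above."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def read_sysctl(name: str) -> Optional[str]:
    """Return the current value of a tunable, or None if it cannot be read."""
    try:
        return sysctl_path(name).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    wanted = parse_sysctl_lines(
        "net.ipv4.tcp_fack = 1\nnet.ipv4.tcp_recovery = 0\n"
    )
    for key, value in wanted.items():
        print(key, "wanted:", value, "current:", read_sysctl(key))
```

The script only reads values; changing them still requires root and `sysctl -w` or a write to the corresponding /proc/sys file.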

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-15 Thread Oleksandr Natalenko
Hello. With net.ipv4.tcp_fack set to 0 the warning still appears: === » sysctl net.ipv4.tcp_fack net.ipv4.tcp_fack = 0 » LC_TIME=C dmesg -T | grep WARNING [Fri Sep 15 20:40:30 2017] WARNING: CPU: 1 PID: 711 at net/ipv4/tcp_input.c: 2826 tcp_fastretrans_alert+0x7c8/0x990 [Fri Sep 15 20:40:30

Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-14 Thread Oleksandr Natalenko
Hi. I've applied your test patch but it doesn't fix the issue for me since the warning is still there. Were you able to reproduce it? On Monday, 11 September 2017 1:59:02 CEST Neal Cardwell wrote: > Thanks for the detailed report! > > I suspect this is due to the following commit, which happened b

[REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-10 Thread Oleksandr Natalenko
Hello. Since v4.11 (IIRC), there has been a regression in the TCP stack resulting in the warning shown below. Most of the time it is harmless, but occasionally it causes either a freeze or (I believe this is related too) a panic in tcp_sacktag_walk() (because the sk_buff passed to this function is NULL). Unf

kernel BUG at net/netfilter/nf_nat_core.c:395

2016-02-10 Thread Oleksandr Natalenko
Hi. With 4.4.1, a BUG_ON() was triggered in net/netfilter/nf_nat_core.c:395, nf_nat_setup_info(), today on my home router. Here is the full trace captured via netconsole: [1] I perform LAN NAT using nftables like this: === table ip nat { chain prerouting { type nat hook pr

Re: [REGRESSION] tcp/ipv4: kernel panic because of (possible) division by zero

2016-01-06 Thread Oleksandr Natalenko
n, Dec 21, 2015 at 12:25 PM, Oleksandr Natalenko wrote: > Commit 3759824da87b30ce7a35b4873b62b0ba38905ef5 (tcp: PRR uses CRB mode by default and SS mode conditional

Re: [REGRESSION] tcp/ipv4: kernel panic because of (possible) division by zero

2016-01-06 Thread Oleksandr Natalenko
hung Cheng wrote: > On Mon, Dec 21, 2015 at 12:25 PM, Oleksandr Natalenko wrote: >> Commit 3759824da87b30ce7a35b4873b62b0ba38905ef5 (tcp: PRR uses CRB mode by default and SS mode conditionally) introduced changes to net/ipv4/tcp_input.c tcp_cwnd_reduction() tha

Re: [REGRESSION] tcp/ipv4: kernel panic because of (possible) division by zero

2015-12-22 Thread Oleksandr Natalenko
wrote: > On Mon, Dec 21, 2015 at 12:25 PM, Oleksandr Natalenko > > wrote: > > Commit 3759824da87b30ce7a35b4873b62b0ba38905ef5 (tcp: PRR uses CRB mode by > > default and SS mode conditionally) introduced changes to > > net/ipv4/tcp_input.c tcp_cwnd_reduction() that, poss

[REGRESSION] tcp/ipv4: kernel panic because of (possible) division by zero

2015-12-21 Thread Oleksandr Natalenko
Commit 3759824da87b30ce7a35b4873b62b0ba38905ef5 (tcp: PRR uses CRB mode by default and SS mode conditionally) introduced changes to net/ipv4/tcp_input.c tcp_cwnd_reduction() that, possibly, cause division by zero, and therefore, kernel panic in interrupt handler [1]. Reverting 3759824da87b30ce7
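
As background on where a division can occur in this kind of code: RFC 6937 (Proportional Rate Reduction) computes the number of segments to send during recovery as CEIL(prr_delivered * ssthresh / RecoverFS) - prr_out when the amount of data in flight exceeds ssthresh. The sketch below is a plain-Python model of that proportional step only. It is not the kernel's tcp_cwnd_reduction(), the message above does not say which divisor is involved, and the function name, the zero-divisor guard, and the clamp to a non-negative result are assumptions made for illustration.

```python
def prr_sndcnt(prr_delivered: int, ssthresh: int, recover_fs: int, prr_out: int) -> int:
    """Model of the proportional part of PRR from RFC 6937.

    prr_delivered: segments delivered to the receiver during recovery
    ssthresh:      target congestion window after the reduction
    recover_fs:    flight size at the start of recovery (the divisor)
    prr_out:       segments already sent during recovery
    """
    # Assumed guard: with a zero divisor there is nothing sensible to
    # compute, so send nothing instead of dividing by zero.
    if recover_fs <= 0:
        return 0
    # Integer ceiling of prr_delivered * ssthresh / recover_fs.
    proportional = (prr_delivered * ssthresh + recover_fs - 1) // recover_fs
    # Assumed clamp: never report a negative number of segments.
    return max(proportional - prr_out, 0)


if __name__ == "__main__":
    print(prr_sndcnt(prr_delivered=10, ssthresh=5, recover_fs=10, prr_out=3))
    print(prr_sndcnt(prr_delivered=4, ssthresh=5, recover_fs=0, prr_out=1))
```

Without the guard, the second call would raise ZeroDivisionError in Python; in kernel context an unguarded integer division by zero is a fault rather than an exception, which is consistent with the panic described in the report.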