On Thu, 2015-02-12 at 08:48 +0100, Michal Kazior wrote:
Good point. I was actually thinking about it. I can try cooking a
patch unless you want to do it yourself :-)
I've taken a look into this. The most obvious place to add the
timestamp for each packet would be ieee80211_tx_info (i.e.
On Tue, 2015-02-24 at 11:30 +0100, Johannes Berg wrote:
On Tue, 2015-02-24 at 11:24 +0100, Johannes Berg wrote:
On Thu, 2015-02-12 at 08:48 +0100, Michal Kazior wrote:
Good point. I was actually thinking about it. I can try cooking a
patch unless you want to do it yourself :-)
On Fri, Feb 6, 2015 at 1:57 AM, Michal Kazior michal.kaz...@tieto.com wrote:
On 5 February 2015 at 20:50, Dave Taht dave.t...@gmail.com wrote:
[...]
And I really, really, really wish, that just once during this thread,
someone had bothered to try running a test
at a real world MCS rate - say
On Wed, Feb 11, 2015 at 11:48 PM, Michal Kazior michal.kaz...@tieto.com wrote:
On 11 February 2015 at 09:57, Michal Kazior michal.kaz...@tieto.com wrote:
On 10 February 2015 at 15:19, Johannes Berg johan...@sipsolutions.net
wrote:
On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote:
+
On 11 February 2015 at 09:57, Michal Kazior michal.kaz...@tieto.com wrote:
On 10 February 2015 at 15:19, Johannes Berg johan...@sipsolutions.net wrote:
On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote:
+ if (msdu->sk) {
+ ewma_add(&ar->tx_delay_us,
+
On 11 February 2015 at 14:17, Eric Dumazet eric.duma...@gmail.com wrote:
On Wed, 2015-02-11 at 09:33 +0100, Michal Kazior wrote:
If I set tcp_limit_output_bytes to 700K+ I can get ath10k w/ cushion
w/ aggregation to reach 600mbps on a single flow.
You know, there is a reason this sysctl
On Wed, 2015-02-11 at 09:33 +0100, Michal Kazior wrote:
If I set tcp_limit_output_bytes to 700K+ I can get ath10k w/ cushion
w/ aggregation to reach 600mbps on a single flow.
You know, there is a reason this sysctl exists in the first place ;)
The first suggestion I made to you was to raise
On Tue, 2015-02-10 at 15:19 +0100, Johannes Berg wrote:
On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote:
+ if (msdu->sk) {
+ ewma_add(&ar->tx_delay_us,
+ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp))
/
+
On 9 February 2015 at 16:11, Eric Dumazet eric.duma...@gmail.com wrote:
On Mon, 2015-02-09 at 14:47 +0100, Michal Kazior wrote:
[...]
This is not what I suggested.
If you test this on any other network device, you'll have
sk->sk_tx_completion_delay_us == 0
amount = 0 * (sk->sk_pacing_rate
On Tue, 2015-02-10 at 04:54 -0800, Eric Dumazet wrote:
Hi Michal
This is almost it ;)
As I said you must do this using u64 arithmetic, we still support 32bit
kernels.
Also, >> 20 instead of / 100 introduces a 5% error, I would use a
plain divide, as the compiler will use a
On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote:
+ if (msdu->sk) {
+ ewma_add(&ar->tx_delay_us,
+ktime_to_ns(ktime_sub(ktime_get(), skb_cb->stamp)) /
+NSEC_PER_USEC);
+
+
On Mon, 2015-02-09 at 14:47 +0100, Michal Kazior wrote:
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 65caf8b..5e249bf 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1996,6 +1996,7 @@ static bool tcp_write_xmit(struct sock *sk,
unsigned int mss_now,
On 6 February 2015 at 15:09, Michal Kazior michal.kaz...@tieto.com wrote:
On 6 February 2015 at 14:53, Eric Dumazet eric.duma...@gmail.com wrote:
On Fri, 2015-02-06 at 05:40 -0800, Eric Dumazet wrote:
tcp_wfree() could maintain in tp->tx_completion_delay_ms an EWMA
of TX completion delay. But
On 5 February 2015 at 20:50, Dave Taht dave.t...@gmail.com wrote:
[...]
And I really, really, really wish, that just once during this thread,
someone had bothered to try running a test
at a real world MCS rate - say MCS1, or MCS4, and measured the latency
under load of that...
Time between
On 5 February 2015 at 18:10, Eric Dumazet eric.duma...@gmail.com wrote:
On Thu, 2015-02-05 at 06:41 -0800, Eric Dumazet wrote:
Not at all. This basically removes backpressure.
A single UDP socket can now blast packets regardless of SO_SNDBUF
limits.
This basically removes years of work
On 6 February 2015 at 14:40, Eric Dumazet eric.duma...@gmail.com wrote:
On Fri, 2015-02-06 at 10:42 +0100, Michal Kazior wrote:
The above brings back previous behaviour, i.e. I can get 600mbps TCP
on 5 flows again. Single flow is still (as it was before TSO
autosizing) limited to roughly
On 6 February 2015 at 14:53, Eric Dumazet eric.duma...@gmail.com wrote:
On Fri, 2015-02-06 at 05:40 -0800, Eric Dumazet wrote:
tcp_wfree() could maintain in tp->tx_completion_delay_ms an EWMA
of TX completion delay. But this would require yet another expensive
call to ktime_get() if HZ < 1000.
On Fri, 2015-02-06 at 05:53 -0800, Eric Dumazet wrote:
wifi could eventually do that, providing in skb->tx_completion_delay_us
the time spent in wifi driver.
This way, we would have no penalty for network devices doing normal skb
orphaning (loopback interface, ethernet, ...)
Another way
On Fri, 2015-02-06 at 15:08 +0100, Michal Kazior wrote:
Hmm.. I confirm it works. However the value at which I get full rate
on a single flow is more than 2048K. Also using non-default
wmem_default seems to introduce packet loss as per iperf reports at
the receiver. I suppose this is kind of
On 05/02/2015 15:48, Eric Dumazet wrote:
On Thu, 2015-02-05 at 14:44 +0100, Michal Kazior wrote:
I do get your point. But 1.5ms is really tough on Wi-Fi.
Just look at this:
; ping 192.168.1.2 -c 3
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1
On Fri, 2015-02-06 at 10:42 +0100, Michal Kazior wrote:
The above brings back previous behaviour, i.e. I can get 600mbps TCP
on 5 flows again. Single flow is still (as it was before TSO
autosizing) limited to roughly ~280mbps.
I never really bothered before to understand why I need to push
On Fri, 2015-02-06 at 05:40 -0800, Eric Dumazet wrote:
tcp_wfree() could maintain in tp->tx_completion_delay_ms an EWMA
of TX completion delay. But this would require yet another expensive
call to ktime_get() if HZ < 1000.
Then tcp_write_xmit() could use it to adjust :
limit = max(2 *
From: Eric Dumazet
On Fri, 2015-02-06 at 05:53 -0800, Eric Dumazet wrote:
wifi could eventually do that, providing in skb->tx_completion_delay_us
the time spent in wifi driver.
This way, we would have no penalty for network devices doing normal skb
orphaning (loopback interface,
If you increase ability to flood on one flow, then you need to make sure
receiver has big rcvbuf as well.
echo 200 > /proc/sys/net/core/rmem_default
Otherwise it might drop bursts.
This is the kind of things that TCP does automatically, not UDP.
An alternative, if the application involved
On Thu, 2015-02-05 at 07:46 +0100, Michal Kazior wrote:
On 4 February 2015 at 22:11, Eric Dumazet eric.duma...@gmail.com wrote:
Most conservative patch would be :
diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c
b/drivers/net/wireless/ath/ath10k/htt_rx.c
index
On Thu, 2015-02-05 at 04:57 -0800, Eric Dumazet wrote:
The intention is to control the queues to the following :
1 ms of buffering, but limited to a configurable value.
On a 40Gbps flow, 1ms represents 5 MB, which is insane.
We do not want to queue 5 MB of traffic, this would destroy
On 5 February 2015 at 14:19, Eric Dumazet eric.duma...@gmail.com wrote:
On Thu, 2015-02-05 at 04:57 -0800, Eric Dumazet wrote:
The intention is to control the queues to the following :
1 ms of buffering, but limited to a configurable value.
On a 40Gbps flow, 1ms represents 5 MB, which is
On Thu, 2015-02-05 at 05:19 -0800, Eric Dumazet wrote:
TCP could eventually dynamically adjust the tcp_limit_output_bytes,
using a per flow dynamic value, but I would rather not add a kludge in
TCP stack only to deal with a possible bug in ath10k driver.
niu has a similar issue and
On Thu, 2015-02-05 at 09:38 +0100, Michal Kazior wrote:
On 4 February 2015 at 22:11, Eric Dumazet eric.duma...@gmail.com wrote:
I do not see how a TSO patch could hurt a flow not using TSO/GSO.
This makes no sense.
Hmm..
@@ -2018,8 +2053,8 @@ static bool tcp_write_xmit(struct sock
On Thu, 2015-02-05 at 14:44 +0100, Michal Kazior wrote:
I do get your point. But 1.5ms is really tough on Wi-Fi.
Just look at this:
; ping 192.168.1.2 -c 3
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=1.83 ms
64 bytes from
On 4 February 2015 at 22:11, Eric Dumazet eric.duma...@gmail.com wrote:
I do not see how a TSO patch could hurt a flow not using TSO/GSO.
This makes no sense.
Hmm..
@@ -2018,8 +2053,8 @@ static bool tcp_write_xmit(struct sock *sk,
unsigned int mss_now, int nonagle,
* of
On Thu, 2015-02-05 at 14:44 +0100, Michal Kazior wrote:
Ok. I tried calling skb_orphan() right after I submit each Tx frame
(similar to niu which does this in start_xmit):
--- a/drivers/net/wireless/ath/ath10k/htt_tx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_tx.c
@@ -564,6 +564,8 @@ int
On Thu, 2015-02-05 at 06:41 -0800, Eric Dumazet wrote:
Not at all. This basically removes backpressure.
A single UDP socket can now blast packets regardless of SO_SNDBUF
limits.
This basically removes years of work trying to fix bufferbloat.
I still do not understand why increasing
On Fri, Feb 6, 2015 at 2:44 AM, Michal Kazior michal.kaz...@tieto.com wrote:
On 5 February 2015 at 14:19, Eric Dumazet eric.duma...@gmail.com wrote:
On Thu, 2015-02-05 at 04:57 -0800, Eric Dumazet wrote:
The intention is to control the queues to the following :
1 ms of buffering, but
I do not see how a TSO patch could hurt a flow not using TSO/GSO.
This makes no sense.
ath10k tx completions being batched/deferred to a tasklet might increase
probability to hit this condition in tcp_wfree() :
/* If this softirq is serviced by ksoftirqd, we are likely under stress.
On 4 February 2015 at 22:11, Eric Dumazet eric.duma...@gmail.com wrote:
I do not see how a TSO patch could hurt a flow not using TSO/GSO.
This makes no sense.
ath10k tx completions being batched/deferred to a tasklet might increase
probability to hit this condition in tcp_wfree() :
On Wed, 2015-02-04 at 13:22 +0100, Michal Kazior wrote:
On 4 February 2015 at 12:57, Eric Dumazet eric.duma...@gmail.com wrote:
To disable gso you would have to use :
ethtool -K wlan1 gso off
Oh, thanks! This works. However I can't turn it on:
; ethtool -K wlan1 gso on
Could not
On 4 February 2015 at 13:38, Eric Dumazet eric.duma...@gmail.com wrote:
On Wed, 2015-02-04 at 13:22 +0100, Michal Kazior wrote:
On 4 February 2015 at 12:57, Eric Dumazet eric.duma...@gmail.com wrote:
To disable gso you would have to use :
ethtool -K wlan1 gso off
Oh, thanks! This
On 3 February 2015 at 15:27, Eric Dumazet eric.duma...@gmail.com wrote:
On Tue, 2015-02-03 at 12:50 +0100, Michal Kazior wrote:
[...]
IOW:
- stretch acks / TSO defer don't seem to help much (when compared to
throughput results from yesterday)
- GRO helps
- disabling A-MSDU on sender helps
On Wed, 2015-02-04 at 13:53 +0100, Michal Kazior wrote:
The hardware itself seems to be capable. The firmware is a problem
though. I'm also not sure if mac80211 can handle this as is. No 802.11
driver seems to support SG except wil6210 which uses cfg80211 and
netdevs directly.
mac80211
OK guys
Using a mlx4 testbed I can reproduce the problem by pushing coalescing
settings and disabling SG (thus disabling GSO)
ethtool -K eth0 sg off
Actual changes:
scatter-gather: off
tx-scatter-gather: off
generic-segmentation-offload: off [requested on]
ethtool -C eth0 tx-usecs 1024
On 3 February 2015 at 02:18, Eric Dumazet eric.duma...@gmail.com wrote:
On Mon, 2015-02-02 at 10:52 -0800, Eric Dumazet wrote:
It seems to break ACK clocking badly (linux stack has a somewhat buggy
tcp_tso_should_defer(), which relies on ACK being received smoothly, as
no timer is setup to
On Tue, 2015-02-03 at 06:27 -0800, Eric Dumazet wrote:
Are packets TX completed after a timer or something ?
Some very heavy stuff might run from tasklet (or other softirq triggered)
event.
Right, commit 6c5151a9ffa9f796f2d707617cecb6b6b241dff8
(ath10k: batch htt tx/rx completions)
is
On 2 February 2015 at 19:52, Eric Dumazet eric.duma...@gmail.com wrote:
On Mon, 2015-02-02 at 11:27 +0100, Michal Kazior wrote:
While testing I've had my internal GRO patch for ath10k and no stretch
ack patches.
Thanks for the data, I took a look at it.
I am afraid this GRO patch might be
On 3 February 2015 at 00:06, Eric Dumazet eric.duma...@gmail.com wrote:
On Mon, 2015-02-02 at 13:25 -0800, Ben Greear wrote:
It is a big throughput win to have fewer TCP ack packets on
wireless since it is a half-duplex environment. Is there anything
we could improve so that we can have
On Tue, Feb 3, 2015 at 12:27 AM, David Lang da...@lang.hm wrote:
On Mon, 2 Feb 2015, Avery Pennarun wrote:
On Mon, Feb 2, 2015 at 11:44 AM, Björn Smedman b...@anyfi.net wrote:
We've got an SDN-inspired architecture with 802.11 frame tunneling (a
la CAPWAP), airtime fairness, infrastructure
On 02/02/2015 10:52 AM, Eric Dumazet wrote:
On Mon, 2015-02-02 at 11:27 +0100, Michal Kazior wrote:
While testing I've had my internal GRO patch for ath10k and no stretch
ack patches.
Thanks for the data, I took a look at it.
I am afraid this GRO patch might be the problem.
It seems
On Mon, Feb 2, 2015 at 11:44 AM, Björn Smedman b...@anyfi.net wrote:
On Mon, Feb 2, 2015 at 5:21 AM, Avery Pennarun apenw...@google.com wrote:
While there is definitely some work to be done in handoff, it seems
like there are some fine implementations of this already in existence.
Several
On Mon, 2015-02-02 at 10:52 -0800, Eric Dumazet wrote:
It seems to break ACK clocking badly (linux stack has a somewhat buggy
tcp_tso_should_defer(), which relies on ACK being received smoothly, as
no timer is setup to split the TSO packet.)
Following patch might help the TSO split defer
On Mon, 2015-02-02 at 13:25 -0800, Ben Greear wrote:
It is a big throughput win to have fewer TCP ack packets on
wireless since it is a half-duplex environment. Is there anything
we could improve so that we can have fewer acks and still get
good tcp stack behaviour?
First apply TCP stretch
On 30 January 2015 at 15:40, Eric Dumazet eric.duma...@gmail.com wrote:
On Fri, 2015-01-30 at 14:39 +0100, Michal Kazior wrote:
I've briefly tried playing with this knob to no avail unfortunately. I
tried 256K, 1M - it didn't improve TCP performance. When I tried to
make it smaller (e.g. 16K)
On Sun, Feb 1, 2015 at 11:04 PM, Avery Pennarun apenw...@google.com wrote:
On Sun, Feb 1, 2015 at 6:34 PM, Andrew McGregor andrewm...@gmail.com wrote:
I missed one item in my list of potential improvements: the most braindead
thing 802.11 has to say about rates is that broadcast and multicast
On Mon, Feb 2, 2015 at 5:21 AM, Avery Pennarun apenw...@google.com wrote:
On Sun, Feb 1, 2015 at 9:43 AM, dpr...@reed.com wrote:
Just to clarify, managing queueing in a single access point WiFi network is
only a small part of the problem of fixing the rapidly degrading performance
of WiFi
On Mon, 2015-02-02 at 11:27 +0100, Michal Kazior wrote:
While testing I've had my internal GRO patch for ath10k and no stretch
ack patches.
Thanks for the data, I took a look at it.
I am afraid this GRO patch might be the problem.
It seems to break ACK clocking badly (linux stack has a
On Sun, Feb 1, 2015 at 6:34 PM, Andrew McGregor andrewm...@gmail.com wrote:
I missed one item in my list of potential improvements: the most braindead
thing 802.11 has to say about rates is that broadcast and multicast packets
should be sent at 'the lowest basic rate in the current supported
On Sun, Feb 1, 2015 at 9:43 AM, dpr...@reed.com wrote:
Just to clarify, managing queueing in a single access point WiFi network is
only a small part of the problem of fixing the rapidly degrading performance
of WiFi based systems.
Can you explain what you mean by rapidly degrading? The
On Sun, 1 Feb 2015, Avery Pennarun wrote:
On Sun, Feb 1, 2015 at 9:43 AM, dpr...@reed.com wrote:
I personally think that things like promoting semi-closed, essentially
proprietary ESSID-based bridged distribution systems as good ideas are
counterproductive to this goal. But that's perhaps
On Fri, 2015-01-30 at 14:47 +0100, Arend van Spriel wrote:
Indeed and that is what we would like to address in our wireless
drivers. I will setup some experiments using the fraction sizing and
post my findings. Again sorry if I offended you.
You did not, but I had no feedback about my
On Fri, 2015-01-30 at 14:39 +0100, Michal Kazior wrote:
I've briefly tried playing with this knob to no avail unfortunately. I
tried 256K, 1M - it didn't improve TCP performance. When I tried to
make it smaller (e.g. 16K) the traffic dropped even more so it does
have an effect. It seems
On 01/29/15 14:14, Eric Dumazet wrote:
On Thu, 2015-01-29 at 12:48 +0100, Michal Kazior wrote:
Hi,
I'm not subscribed to netdev list and I can't find the message-id so I
can't reply directly to the original thread `BW regression after tcp:
refine TSO autosizing`.
I've noticed a big TCP
Hi,
I'm not subscribed to netdev list and I can't find the message-id so I
can't reply directly to the original thread `BW regression after tcp:
refine TSO autosizing`.
I've noticed a big TCP performance drop with ath10k
(drivers/net/wireless/ath/ath10k) on 3.19-rc5. Instead of 500mbps I
get
On Thu, 2015-01-29 at 12:48 +0100, Michal Kazior wrote:
Hi,
I'm not subscribed to netdev list and I can't find the message-id so I
can't reply directly to the original thread `BW regression after tcp:
refine TSO autosizing`.
I've noticed a big TCP performance drop with ath10k
62 matches