[PATCH net 3/3] RDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one()

2016-06-04 Thread Sowmini Varadhan
nd testing for RDS_IN_XMIT after lock_sock could result in a deadlock with tcp_sendmsg, this commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which will prevent a transition to RDS_CONN_UP from rds_tcp_state_change(). Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --

[PATCH net 0/3] RDS: TCP: socket locking RDS packet assembly fixes

2016-06-04 Thread Sowmini Varadhan
sent RDS datagrams will get retransmitted after rds_tcp_accept_one() switches sockets. Patch 3 fixes a race window which would prematurely re-enable rds_send_xmit() before the rds_tcp_connection setup has been completed in rds_tcp_accept_one(). Sowmini Varadhan (3): RDS: TCP: Add/use

[PATCH net 1/3] RDS: TCP: Add/use rds_tcp_reset_callbacks to reset tcp socket safely

2016-06-04 Thread Sowmini Varadhan
(for old and new sockets) when resetting existing callbacks. Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- net/rds/tcp.c| 65 +++-- net/rds/tcp.h|1 + net/rds/tcp_listen.c | 13 +++--- 3 files chang

[PATCH net 2/3] RDS: TCP: Retransmit half-sent datagrams when switching sockets in rds_tcp_reset_callbacks

2016-06-04 Thread Sowmini Varadhan
When we switch a connection's sockets in rds_tcp_reset_callbacks, any partially sent datagram must be retransmitted on the new socket so that the receiver can correctly reassemble the RDS datagram. Use rds_send_reset() which is designed for this purpose. Signed-off-by: Sowmini Varadhan

latest net-next does not build on sparc?

2016-06-03 Thread Sowmini Varadhan
/hypermail/linux/kernel/1602.3/04431.html to get it to build- did I just miss something in my pull? --Sowmini

Re: IPv6 extension header privileges

2016-05-27 Thread Sowmini Varadhan
Are you suggesting some type of AAA to vet the hbh option itself? --Sowmini

Re: IPv6 extension header privileges

2016-05-22 Thread Sowmini Varadhan
l APIs don't > >>>>> allow setting EH. I would like to avoid that :-) > On 21.05.2016 19:46, Sowmini Varadhan wrote: > > Do you mean this > > http://www.ietf.org/mail-archive/web/spud/current/msg00365.html On (05/22/16 03:08), Hannes Frederic Sowa wrote: > Hmm,

Re: IPv6 extension header privileges

2016-05-21 Thread Sowmini Varadhan
is is something IETF > people probably know more about what an impact this change could have. I think it would be ok for some options to be privileged, others to not. We certainly have precedent for that from the classic socket APIs.. and man pages etc can document any restrictions, as we do with other ioctls/sockopts etc. --Sowmini

Re: IPv6 extension header privileges

2016-05-21 Thread Sowmini Varadhan
the networking separation we can't make it simply > non-priv. sure, I agree with that point. --Sowmini

Re: IPv6 extension header privileges

2016-05-20 Thread Sowmini Varadhan
parsing overhead and thus decided to > block it for ordinary users. That's probably more likely, esp for hbh options. It may also be interesting to find out what BSD does in these cases. --Sowmini

[PATCH net 0/2] RDS: TCP: connection spamming fixes

2016-05-18 Thread Sowmini Varadhan
TCP connection resets due to re-execution of the connection arbitration logic. Sowmini Varadhan (2): RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp RDS: TCP: Avoid rds connection churn from rogue SYNs net/rds/tcp_listen.c | 13 + 1 files

[PATCH net 2/2] RDS: TCP: Avoid rds connection churn from rogue SYNs

2016-05-18 Thread Sowmini Varadhan
When a rogue SYN is received after the connection arbitration algorithm has converged, the incoming SYN should not needlessly quiesce the transmit path, and it should not result in needless TCP connection resets due to re-execution of the connection arbitration logic. Signed-off-by: Sowmini

[PATCH net 1/2] RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp

2016-05-18 Thread Sowmini Varadhan
int will encounter a null rds_tcp_listen_sock, and must exit gracefully to allow the RDS-TCP termination to complete successfully. Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- net/rds/tcp_listen.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git

[PATCH net-next] skbuff: remove unused variable `doff'

2016-05-10 Thread Sowmini Varadhan
x.net> Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- net/core/skbuff.c |6 -- 1 files changed, 0 insertions(+), 6 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 5586be9..f2b77e5 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -46

Re: A couple of questions about the SKB fragments

2016-05-05 Thread Sowmini Varadhan
's in the skb_shared_info. The len and data_len should be the sum-total for the whole skb, including skb_frag_t's and ->frag_list. --Sowmini

[PATCH net v2 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock

2016-05-02 Thread Sowmini Varadhan
cleared in the conn->c_flags indicating that any threads in rds_tcp_xmit are done. Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> Acked-by: Santosh Shilimkar

[PATCH net v2 0/2] RDS: TCP: sychronization during connection startup

2016-05-02 Thread Sowmini Varadhan
threads in rds_tcp_xmit have completed (otherwise a null-ptr deref may be encountered). Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect() path. v2: review comments from Santosh Shilimkar, other spelling corrections Sowmini Varadhan (2): RDS:TCP: Synchronize

[PATCH net v2 2/2] RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.

2016-05-02 Thread Sowmini Varadhan
rbitration logic) are UP (i.e., outgoing SYN was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent outgoing SYN before we saw incoming SYN). Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilim...@oracle.com> --- v2

Re: [PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.

2016-05-02 Thread Sowmini Varadhan
> rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING); > Like patch 1/2, probably we can leverage return value of above. : > You probably don't need the local 'conn_state' and below should work. > if (!rds_conn_connecting(conn) && !rds_conn_up(conn)) see explanation for comment to 1/2. --Sowmini

Re: [PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock

2016-05-02 Thread Sowmini Varadhan
ctice, it's not frequent to have syns crossing each other at "almost the same time". --Sowmini

[PATCH net 0/2] RDS: TCP: sychronization during connection startup

2016-05-01 Thread Sowmini Varadhan
any threads in rds_tcp_xmit have completed (otherwise a null-ptr deref may be encountered). Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect() path. Sowmini Varadhan (2): RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock RDS: TCP: Synchrnozize

[PATCH net 2/2] RDS: TCP: Synchrnozize accept() and connect() paths on t_conn_lock.

2016-05-01 Thread Sowmini Varadhan
rbitration logic) are UP (i.e., outgoing SYN was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent outgoing SYN before we saw incoming SYN). Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- net/rds/tcp.c |1 + net/rds/tcp.h |

[PATCH net 1/2] RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock

2016-05-01 Thread Sowmini Varadhan
cleared in the conn->c_flags indicating that any threads in rds_tcp_xmit are done. Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- net/rds/tcp.c

[PATCH v2 net-next 2/2] RDS: TCP: Call pskb_extract() helper function

2016-04-22 Thread Sowmini Varadhan
rds-stress experiments with request size 256 bytes, 8K acks, using 16 threads show a 40% improvement when pskb_extract() replaces the {skb_clone(..); pskb_pull(..); pskb_trim(..);} pattern in the Rx path, so we leverage the perf gain with this commit. Signed-off-by: Sowmini Varadhan <sowmini.va

[PATCH v2 net-next 0/2] pskb_extract() helper function.

2016-04-22 Thread Sowmini Varadhan
for Rx on the PF_RDS socket. Patch 1 of this patchset adds a pskb_extract() function that does all this without the redundant memcpy's in pskb_expand_head() and __pskb_pull_tail(). v2: Marcelo Leitner review comments Sowmini Varadhan (2): Add pskb_extract() helper function Call

[PATCH v2 net-next 1/2] skbuff: Add pskb_extract() helper function

2016-04-22 Thread Sowmini Varadhan
rim() is then invoked to trim clone down to the requested to_copy bytes. Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- v2: Marcelo Leitner review comments include/linux/skbuff.h |2 + net/core/skbuff.c | 242 2 files cha

Re: [PATCH net-next 0/2] pskb_extract() helper function.

2016-04-22 Thread Sowmini Varadhan
r RDS-TCP, the rx side does show a noticeable benefit with rds-stress. To an extent, this is also impacted by the packet size, and the type of test (for our DB workloads, we use request-response tests, and the packet size tends to typically be 8K req, 256 byte responses), so I guess ymmv. --Sowmini

[PATCH net-next 0/2] pskb_extract() helper function.

2016-04-20 Thread Sowmini Varadhan
for Rx on the PF_RDS socket. Patch 1 of this patchset adds a pskb_extract() function that does all this without the redundant memcpy's in pskb_expand_head() and __pskb_pull_tail(). Sowmini Varadhan (2): Add pskb_extract() helper function Call pskb_extract() helper function include/linux

[PATCH net-next 2/2] RDS: TCP: Call pskb_extract() helper function

2016-04-20 Thread Sowmini Varadhan
rds-stress experiments with request size 256 bytes, 8K acks, using 16 threads show a 40% improvement when pskb_extract() replaces the {skb_clone(..); pskb_pull(..); pskb_trim(..);} pattern in the Rx path, so we leverage the perf gain with this commit. Signed-off-by: Sowmini Varadhan <sowmini.va

[PATCH net-next 1/2] skbuff: Add pskb_extract() helper function

2016-04-20 Thread Sowmini Varadhan
rim() is then invoked to trim clone down to the requested to_copy bytes. Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- include/linux/skbuff.h |2 + net/core/skbuff.c | 248 2 files changed, 250 insertions(+), 0 deletions(-) d

[PATCH RFC net-next 1/2] skbuff: Add pskb_extract() helper function

2016-04-18 Thread Sowmini Varadhan
rim() is then invoked to trim clone down to the requested to_copy bytes. Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- include/linux/skbuff.h |2 + net/core/skbuff.c | 248 2 files changed, 250 insertions(+), 0 deletions(-) d

[PATCH RFC net-next 2/2] RDS: TCP: Call pskb_extract() helper function

2016-04-18 Thread Sowmini Varadhan
rds-stress experiments with request size 256 bytes, 8K acks, using 16 threads show a 40% improvement when pskb_extract() replaces the {skb_clone(..); pskb_pull(..); pskb_trim(..);} pattern in the Rx path, so we leverage the perf gain with this commit. Signed-off-by: Sowmini Varadhan <sowmini.va

[PATCH RFC net-next 0/2] pskb_extract() helper function.

2016-04-18 Thread Sowmini Varadhan
the needless copy of trailer frags/pages that will then get trimmed away. I am deferring that optimization for the next iteration, and would like to get feedback on this first pass, which by itself gives a noticeable perf boost. Sowmini Varadhan (2): Add pskb_extract() helper function Call

Re: optimizations to sk_buff handling in rds_tcp_data_ready

2016-04-09 Thread Sowmini Varadhan
list cases. I'm afraid I might end up needing most of the stuff under the "Pure masohism" (sic) comment in __pskb_pull_tail(). --Sowmini

Re: optimizations to sk_buff handling in rds_tcp_data_ready

2016-04-07 Thread Sowmini Varadhan
On (04/07/16 07:16), Eric Dumazet wrote: > Use skb split like TCP in output path ? That almost looks like what I want, but skb_split modifies both skb and skb1, and I want to leave skb untouched (otherwise I will mess up the book-keeping in tcp_read_sock). But skb_split is a good template- I

optimizations to sk_buff handling in rds_tcp_data_ready

2016-04-07 Thread Sowmini Varadhan
oid skb_copy_bits() above, and to pass delta to pskb_expand_head, - in pskb_expand_head, only do the memcpy listed above if delta <= 0 Any other ideas for how to achieve this? --Sowmini

Re: [net PATCH v2] i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet

2016-03-30 Thread Sowmini Varadhan
Yes, this fixes it for me too! Tested-by: Sowmini Varadhan <sowmini.varad...@oracle.com>

Re: [net PATCH] i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet

2016-03-30 Thread Sowmini Varadhan
ket that you see the issue. I dont know if that provides any clues. --Sowmini

Re: [net PATCH] i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet

2016-03-30 Thread Sowmini Varadhan
dpage path that i40e does not anticipate. Other drivers (ixgbe etc) work fine, so my hunch would be that this is specific to i40e (and not a skb_linearize bug) but I could be wrong. --Sowmini

Re: [net PATCH] i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet

2016-03-30 Thread Sowmini Varadhan
e if that changed info is significant? --Sowmini

Re: [net PATCH] i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet

2016-03-30 Thread Sowmini Varadhan
out with rds-stress on my test-pair, unfortunately, I still see the Tx hang. Setting up the test is quite easy- for reference, the instructions are here: https://sourceforge.net/p/e1000/mailman/message/34936766/ --Sowmini

[PATCH net-next v4 2/2] RDS: TCP: Remove unused constant

2016-03-20 Thread Sowmini Varadhan
RDS_TCP_DEFAULT_BUFSIZE has been unused since commit 1edd6a14d24f ("RDS-TCP: Do not bloat sndbuf/rcvbuf in rds_tcp_tune"). Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- v3: review comments from Santosh Shilimkar net/rds/tcp.c |2 -- 1 files changed, 0

Re: [PATCH net-next v3 1/2] RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-19 Thread Sowmini Varadhan
bably doing this very rarely, and for a good reason, and is fully aware of the cost. So there is some degree of human control possible. --Sowmini

[PATCH net-next v4 1/2] RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-19 Thread Sowmini Varadhan
of existing rds-tcp sockets when the tunable is modified. To make sure that all connections in the netns pick up the same value for the tunable, we reset existing rds-tcp connections in the netns, so that they can reconnect with the new parameters. Signed-off-by: Sowmini Varadhan <sowmini.va

Re: [PATCH V3 net-next] ixgbe: Avoid unaligned access in ixgbe_atr() for LLC packets

2016-03-19 Thread Sowmini Varadhan
e various lists yesterday, so he could have just told me about this. But, whatever. --sowmini

[PATCH net-next v4 0/2] RDS: TCP: tunable socket buffer parameters

2016-03-19 Thread Sowmini Varadhan
Patch 1 uses sysctl to create tunable socket buffer size parameters. Patch 2 removes an unused constant. v2: use sysctl v3: review comments from Santosh Shilimkar, Eric Dumazet v4: review comments from Hannes Sowa Sowmini Varadhan (2): RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds

Re: [PATCH net-next v3 1/2] RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-16 Thread Sowmini Varadhan
se I need to rds_tcp_sysctl_reset() existing connections to make them use the new tunable. --Sowmini

Re: [PATCH net-next v3 1/2] RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-16 Thread Sowmini Varadhan
do this. I needed to do the copy_from_user() + kstrtoul, and user_atoi had some additional checks that seemed useful. --Sowmini

Re: [PATCH v2 net-next] rds-tcp: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-15 Thread Sowmini Varadhan
enting a design extension to the daemon approach in the future. By itself, the sysctl support adds value and can co-exist with those extensions. --Sowmini

[PATCH net-next v3 1/2] RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-15 Thread Sowmini Varadhan
of existing rds-tcp sockets when the tunable is modified. To make sure that all connections in the netns pick up the same value for the tunable, we reset existing rds-tcp connections in the netns, so that they can reconnect with the new parameters. Signed-off-by: Sowmini Varadhan <sowmini.va

[PATCH net-next v3 0/2] RDS: TCP: tunable socket buffer parameters

2016-03-15 Thread Sowmini Varadhan
Patch 1 uses sysctl to create tunable socket buffer size parameters. Patch 2 removes an unused constant. Changes since v2: review comments from Santosh Shilimkar, Eric Dumazet Sowmini Varadhan (2): RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket RDS: TCP: Remove unused

[PATCH net-next v3 2/2] RDS: TCP: Remove unused constant

2016-03-15 Thread Sowmini Varadhan
RDS_TCP_DEFAULT_BUFSIZE has been unused since commit 1edd6a14d24f ("RDS-TCP: Do not bloat sndbuf/rcvbuf in rds_tcp_tune"). Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- v3: review comments from Santosh Shilimkar net/rds/tcp.c |2 -- 1 files changed, 0

Re: [PATCH v2 net-next] rds-tcp: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-15 Thread Sowmini Varadhan
uite unpredictable (as you point out, depends on the kernel version among other things). > But again if your sysctl allows to set a value below SOCK_MIN_SNDBUF, > that might be a problem, because stack could have a hidden bug for very > small values of sndbuf/rcvbuf. sure, fixing/testing it as I write this. --Sowmini
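
The concern above (a sysctl that accepts values below the stack's minimum) amounts to validating user input before it is stored. A minimal stand-alone sketch of that check follows; it is not the rds-tcp handler. The name example_set_bufsize and the floor EXAMPLE_MIN_BUFSIZE are hypothetical, and the kernel's real SOCK_MIN_SNDBUF is computed differently and may vary by version.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical floor, for illustration only; not the kernel's value. */
#define EXAMPLE_MIN_BUFSIZE 4096L

/*
 * Validate a user-supplied buffer size before storing it.
 * Returns 0 and writes *out on success, or -EINVAL (leaving *out
 * untouched) when the requested value is below the floor.
 */
static int example_set_bufsize(long requested, long *out)
{
	assert(out != NULL);
	if (requested < EXAMPLE_MIN_BUFSIZE)
		return -EINVAL;	/* reject tiny values such as 1 */
	*out = requested;
	return 0;
}
```

Rejecting the value, rather than silently rounding it up, is one possible design choice; the thread does not say which one the final patch took.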

Re: [PATCH v2 net-next] rds-tcp: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-15 Thread Sowmini Varadhan
amespace are at least as important configuration > parameters as these two. Those parameters can be added via sysctl if needed- we have not seen the need for that yet. And if something comes up that needs the netlink etc. we can add it down the road. It's just not needed now. Thanks --Sowmini >

Re: [PATCH v2 net-next] rds-tcp: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-15 Thread Sowmini Varadhan
the buffer size to 4608 (as reported by getsockopt in my env. I think the getsockopt value is impacted by many factors). --Sowmini

Re: [PATCH v2 net-next] rds-tcp: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-15 Thread Sowmini Varadhan
On (03/15/16 09:38), santosh shilimkar wrote: > >+if (rtn->sndbuf_size > 0) { > So value of 1 is allowed as well. There should be some > minimum default or multiple of it. Of course above check > can remain as is as long as you validate the user input > in handlers. yes, just as user-space

[PATCH v2 net-next] rds-tcp: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket

2016-03-15 Thread Sowmini Varadhan
of existing rds-tcp sockets when the tunable is modified. To make sure that all connections in the netns pick up the same value for the tunable, we reset existing rds-tcp connections in the netns, so that they can reconnect with the new parameters. Signed-off-by: Sowmini Varadhan <sowmini.va

[PATCH v2 net-next] ixgbe: Avoid unaligned access in ixgbe_atr() for LLC packets

2016-03-14 Thread Sowmini Varadhan
are not likely to correctly find IPVERSION, or "6", at hdr.ipv4->version, but will instead just needlessly trigger an unaligned access. (IPv4/IPv6 over LLC is almost never implemented). The unaligned access is thus avoidable: bail out quickly after examining first->protocol. Signed-off-by:
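
The idea of bailing out early on non-IP frames can be sketched outside the driver as below. This is an illustration only, not the ixgbe code: example_frame_is_ip and the EX_ constants are made-up names, and the function reads the type/length field straight from a raw Ethernet header. For 802.3/LLC frames that field holds a length (at most 1500), so it never matches an IP EtherType and no IP header is ever dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_ETH_HLEN   14	/* dst MAC (6) + src MAC (6) + type/length (2) */
#define EX_ETH_P_IP   0x0800	/* EtherType for IPv4 */
#define EX_ETH_P_IPV6 0x86DD	/* EtherType for IPv6 */

/*
 * Return true only when the frame carries IPv4 or IPv6 directly
 * (Ethernet II). The field is assembled byte by byte, so the check
 * itself performs no multi-byte load on a possibly misaligned address.
 */
static bool example_frame_is_ip(const uint8_t *frame, size_t len)
{
	uint16_t type;

	assert(frame != NULL);
	if (len < EX_ETH_HLEN)
		return false;
	type = (uint16_t)((frame[12] << 8) | frame[13]);
	return type == EX_ETH_P_IP || type == EX_ETH_P_IPV6;
}
```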

Re: [PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-14 Thread Sowmini Varadhan
send out the patches later this week after some more cleanup and testing. --Sowmini However, it would still be nice to know exactly what distribution issues come out of modparam.

Re: [PATCH v2 net-next] ixgbe: Avoid unaligned access in ixgbe_atr() for LLC packets

2016-03-14 Thread Sowmini Varadhan
wn the stack with the NET_IP_ALIGNment and (b) ARP is only sent over Ethernet II (there is no LLC SAP for ARP, which is a big reason why ipv4 is not sent over llc, despite rfc 1042). I figured it would not hurt to pass it down, in case we decide to do something clever with it in the future. --Sowmini

[PATCH net-next] ixgbe: Avoid unaligned access in ixgbe_atr() for LLC packets

2016-03-14 Thread Sowmini Varadhan
are not likely to correctly find IPVERSION, or "6", at hdr.ipv4->version, but will instead just needlessly trigger an unaligned access. (IPv4/IPv6 over LLC is almost never implemented). The unaligned access is thus avoidable: bail out quickly after examining skb->protocol. Signed-off-by:

Re: [PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-11 Thread Sowmini Varadhan
a problem to implement at all, if it ever shows up as a requirement for customers. --Sowmini

Re: [PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-11 Thread Sowmini Varadhan
far simpler to just tell the cluster-setup scripts to just refrain from an ifup of the relevant interfaces till all the config is set up. Besides, the basic problem remains: for an arbitrary kernel module that has parameters that need to be customized before module init, what are the options today? --Sowmini

Re: [PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-11 Thread Sowmini Varadhan
hings to get to per-net vars. I dont know if there is a better alternative than sysctl/module_param for achieving what I'm trying to do, which is to set up the params for the kernel sockets before creating them. If yes, some hints/rtfms would be helpful. --Sowmini

Re: [PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-11 Thread Sowmini Varadhan
On (03/11/16 11:09), Stephen Hemminger wrote: > > Module parameters are a problem for distributions and should only be used > as a last resort. I was not aware of that- out of curiosity, what is the associated problem? What would be the alternative recommendation in this case? --Sowmini

[PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-11 Thread Sowmini Varadhan
Some payload sizes/patterns stand to gain performance benefits by tuning the size of the TCP socket buffers, so this commit adds module parameters to customize those values when desired. Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- net/rds/tcp.c | 16 +

Re: [PATCH v2 net-next 11/13] kcm: Add memory limit for receive message construction

2016-03-07 Thread Sowmini Varadhan
onable constraint, but is it possible to end up with a deadlocked TCP (recv) socket- one for which the receiver closed the window (so sender TCP cannot send the remaining bytes of the kcm message), but cannot be drained because of #3 above? BTW there are a couple of typos above; s/skbuf/skbuff s/incluing/including --Sowmini

Re: net-next build failure due to 16e5cc64?

2016-03-01 Thread Sowmini Varadhan
at no one else has noticed this. It did not take me any effort to reproduce it on an x86. --Sowmini

Re: net-next build failure due to 16e5cc64?

2016-03-01 Thread Sowmini Varadhan
what the compiler is flagging. --Sowmini

Re: net-next build failure due to 16e5cc64?

2016-03-01 Thread Sowmini Varadhan
struct tc_to_netdev tc; : > + tc.type = TC_SETUP_MQPRIO; The compiler error is for fields within the union which lacks both a tag and a union-name. So I'm not sure how the above will help. --Sowmini

net-next build failure due to 16e5cc64?

2016-03-01 Thread Sowmini Varadhan
John, I'm getting a failure when I try to build the latest net-next. I see net/sched/sch_mqprio.c: In function `mqprio_init': net/sched/sch_mqprio.c:145: error: unknown field `tc' specified in initializer net/sched/sch_mqprio.c:145: warning: missing braces around initializer

Re: Invalid sk_policy[] access

2016-02-23 Thread Sowmini Varadhan
ck 1216 sizeof request_sock 312 sizeof inet_request_sock 328 offsetof sk_policy 1 520 So it's good to know that crash does not lie. But then it's odd that the struct sizes (esp of things like request_sock, which are not config dependant) are not the same. --Sowmini

Re: Invalid sk_policy[] access

2016-02-23 Thread Sowmini Varadhan
Hat 4.4.7-4)" But my question from the email remains. Unless I am missing something subtle in the code, a struct request_sock and a struct sock only have the sock_common part in common. So casting a request_sock as a struct sock may have issues? --Sowmini

Re: Invalid sk_policy[] access

2016-02-23 Thread Sowmini Varadhan
intk's of the structures in the case of the v440. And they *are* different, and the numbers match the values dumped on the console on panic. So isn't there actually a problem here? --Sowmini

Re: Invalid sk_policy[] access (was Re: Recent spontaneous reboots on multiple machines)

2016-02-23 Thread Sowmini Varadhan
in the > console. Maybe you're seeing something else then. --Sowmini

Invalid sk_policy[] access (was Re: Recent spontaneous reboots on multiple machines)

2016-02-23 Thread Sowmini Varadhan
ds you are looking for? so how is this supposed to work? (Evidently it worked for Meelis before, but I dont know if that was before or after commit 9e17f8a475). --Sowmini

Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers

2016-02-03 Thread Sowmini Varadhan
ight, but there is a big difference between a performance degradation > and a hard failure. It would at least be nice to know what the > performance hit actually is, if it's acceptable then this would be a > far simpler and much less invasive fix than the alternatives. --Sowmini

Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers

2016-02-03 Thread Sowmini Varadhan
On (02/03/16 09:07), Tom Herbert wrote: > > Kernel unaligned access at TPC[9150dc] ipv4_neigh_lookup+0x38/0x170 > > Sowmini, > > This doesn't look like a hard crash to me. Instead of trying to fix > all the alignment issues for Sparc, can we just take the trap, fix up >

[PATCH net-next 2/2] sunvnet: perf tracepoint invocations to trace LDC state machine

2016-02-02 Thread Sowmini Varadhan
Use sunvnet perf trace macros to monitor LDC message exchange state. Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- drivers/net/ethernet/sun/sunvnet.c | 24 ++-- 1 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethern

[PATCH net-next 1/2] sunvnet: Add support for perf LDC event tracing

2016-02-02 Thread Sowmini Varadhan
Add perf event macros for support of tracing and instrumentation of LDC state machine Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com> --- include/trace/events/sunvnet.h | 139 1 files changed, 139 insertions(+), 0 deletions(-) creat

[PATCH net-next 0/2] sunvnet: perf tracepoint hooks

2016-02-02 Thread Sowmini Varadhan
Added some perf tracepoints to help track and debug sunvnet descriptor state at run-time. Sowmini Varadhan (2): sunvnet: Add support for perf LDC event tracing sunvnet: perf tracepoint invocations to trace LDC state machine drivers/net/ethernet/sun/sunvnet.c | 24 ++- include/trace

Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers

2016-02-02 Thread Sowmini Varadhan
%g1 I get unaligned access traps at __skb_flow_dissect+500 and __skb_flow_dissect+512 (corresponding to saddr and daddr), once for each interface (gretap/eth0 and eth1). --Sowmini

Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers

2016-02-01 Thread Sowmini Varadhan
On (01/31/16 13:37), Tom Herbert wrote: > > Call get_unaligned_be32 when we access 32-bit fields in > __skb_flow_dissect. At the beginning check for unlikely case of > 1-byte aligned packet. > > Note that flow_dissector may be asked to parse packet unaligned > fields in two instances: I tested
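
For readers unfamiliar with the technique named in the quoted patch description, an alignment-safe 32-bit big-endian read can be sketched in user space as follows. load_be32_unaligned is a hypothetical stand-in written for this note; it is not the kernel's get_unaligned_be32() implementation, only an illustration of the same idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 32-bit big-endian value from an address that may not be
 * 4-byte aligned. Each byte is loaded individually, so no word-sized
 * load is issued on the misaligned address (the kind of access that
 * traps on sparc).
 */
static uint32_t load_be32_unaligned(const void *p)
{
	const uint8_t *b = p;

	assert(p != NULL);
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

A typical use would be reading an address field that sits at an odd offset inside a received buffer, for example load_be32_unaligned(buf + 1).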

Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers

2016-02-01 Thread Sowmini Varadhan
On (02/01/16 16:20), Nicolas Dichtel wrote: > There is also the tile architecture, up to 76 cores on the board I've seen. It > requires an alignment on 8! > I wonder how this case may be properly handled. A simple ipip scenario fails. Yes, I'm also observing the same thing. Simply setting up

Re: [net PATCH] flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen

2016-01-31 Thread Sowmini Varadhan
with LLC/SNAP etc. And in addition to all this, anything we can do to help align nested encapsulations (i.e., Tom's concern) is also needed, and maybe it will be cleaner to do that if we just split off deep packet inspection for flow extraction away from eth_get_headlen. --Sowmini

Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers

2016-01-31 Thread Sowmini Varadhan
On (01/31/16 16:24), Eric Dumazet wrote: > > But this test is absolutely useless, what about testing arches that > actually care ? > Yes, I plan to help test this out tomorrow, Tom suggested setting up gre-teb between x86 and sparc. --Sowmini

Re: [net PATCH] flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen

2016-01-30 Thread Sowmini Varadhan
On (01/30/16 09:43), Tom Herbert wrote: > That is not the only case, If a GRE TEB packet is ever received and > flow dissector is called for any reason (like skb_get_hash) there's > going to be problems-- and this doesn't require GRE to even be > configured on the host. > > I have a patch that

Re: [net PATCH] flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen

2016-01-30 Thread Sowmini Varadhan
On (01/29/16 19:23), Eric Dumazet wrote: > BTW, even a memcpy(_addrs->v4addrs, >saddr, 8) could crash, as > the compiler can certainly assume src and dst are 4 bytes aligned, and > could use word accesses when inlining memcpy() even on Sparc. > > Apparently the compiler used

[RFC] Kernel unaligned access at __skb_flow_dissect

2016-01-29 Thread Sowmini Varadhan
There is an unaligned access at __skb_flow_dissect when it calls ip6_flowlabel() with the call stack __skb_flow_dissect+0x2a8/0x87c eth_get_headlen+0x5c/0xaxa4 ixgbe_clean_rx_irq+0x5cc/0xb20 [ixgbe] ixgbe_poll+0x5a4/0x760 [ixgbe] net_rx_action+0x13c/0x354 : Essentially,

Re: [RFC] Kernel unaligned access at __skb_flow_dissect

2016-01-29 Thread Sowmini Varadhan
th memcpy (as was happening with the saddr ref in skb_flow_dissect that puzzled me and Eric because it did not generate any traps). I suppose one could sprinkle a few WARN_ON's for !IS_ALIGNED but that's not a fool-proof detection method either (in addition to all the ugly shouting in the code). --Sowmini

Re: [RFC] Kernel unaligned access at __skb_flow_dissect

2016-01-29 Thread Sowmini Varadhan
On (01/29/16 15:31), Tom Herbert wrote: > But even within flow dissector, to be completely correct, we need to > replace all 32-bit accesses with the memcpy (flow_label, mpls label, > keyid) and be vigilant about new ones coming in. For that matter, .. well, one question that came to my mind when

Re: [RFC] Kernel unaligned access at __skb_flow_dissect

2016-01-29 Thread Sowmini Varadhan
ting me, I don't have any l2 gre involved (simple p2p Ethernet II + ipv6 + tcp, no vlans, gre or other exotic stuff). --Sowmini

Re: [RFC] Kernel unaligned access at __skb_flow_dissect

2016-01-29 Thread Sowmini Varadhan
On (01/29/16 11:37), Eric Dumazet wrote: > > I have no idea why reading iph->saddr or iph->daddr would not hit the > problem, but accessing the 32bit ipv6 flow label would be an issue. > > Something is fishy. I was wondering about this myself. Even on sparc, I only first ran into the errors for

Re: [RFC] Kernel unaligned access at __skb_flow_dissect

2016-01-29 Thread Sowmini Varadhan
plains about unaligned access in most cases, some sins may pass under the radar, and other platforms dont even generate traps, so it's easy to not know that there's a problem, a lot of the time. --Sowmini

Re: [PATCH v2] net: Add eth_platform_get_mac_address() helper.

2016-01-06 Thread Sowmini Varadhan
On (01/06/16 16:33), David Miller wrote: > > Signed-off-by: David S. Miller <da...@davemloft.net> Acked-by: Sowmini Varadhan <sowmini.varad...@oracle.com> > > As promised back in November, I'm committing this into net-next > now. I'd work on the conversions o

Re: [PATCH net-next] ixgbe: Fix for RAR0 not being set to default MAC addr

2016-01-06 Thread Sowmini Varadhan
esn't have destination mac address equals to FF:FF:FF:FF:FF:FF. > > This commit sets RAR0 correctly to default HW mac address. > > Signed-off-by: Tushar Dave <tushar.n.d...@oracle.com> Tested-by: Sowmini Varadhan <sowmini.varad...@oracle.com> -- To unsubscribe fro

Re: Q: bad routing table cache entries

2016-01-03 Thread Sowmini Varadhan
On (12/30/15 15:42), Stas Sergeev wrote: > 29.12.2015 18:22, Sowmini Varadhan wrote: > > Do you have admin control over the ubuntu router? > > If yes, you might want to check the shared_media [#] setting > > on that router for the interfaces with overlapping subnets.

Re: Q: bad routing table cache entries

2015-12-29 Thread Sowmini Varadhan
IPSKB_DOREDIRECT if shared_media is turned off. --Sowmini [#] https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/theconfvariables.html

Re: Q: bad routing table cache entries

2015-12-29 Thread Sowmini Varadhan
irect. You would have to check into the configuration and/or implementation of the router- it should not be sending back a redirect in the above case (different netmasks) even if the ingress and egress physical interfaces are the same. --Sowmini

Re: Q: bad routing table cache entries

2015-12-29 Thread Sowmini Varadhan
I suppose it might not hurt if the receiver can do some sanity checking on the redirect but this might not eliminate every error, since it might not be possible to detect netmask mismatch in every case. --Sowmini
