Re: [Patch net-next] call sk_dst_reset when set SO_DONTROUTE

2018-12-05 Thread peng yu
In fact, my customer's issue is that he set SO_DONTROUTE by mistake.
He shouldn't have done that. But after he set this flag, the connection
had no problem at first. Only after the sk_dst_cache expired for some
reason did the connection get stuck. I think the correct behavior is
that the connection should get stuck immediately after SO_DONTROUTE is
set to 1.

On Wed, Dec 5, 2018 at 4:20 PM Eric Dumazet  wrote:
>
>
>
> On 12/05/2018 04:13 PM, peng yu wrote:
> > > SO_DONTROUTE doesn't impact the TCP receive path, but it should
> > > block the ACKs for received packets. When too many packets are
> > > left unACKed, the client will stop sending, so sock.recv on the
> > > server side stops returning data after it has received some.
> > > I extracted the test code from my customer's production
> > > environment. It can reproduce the issue, but it is not a
> > > good example. I will rewrite the test code and re-submit the patch.
>
> Now I fully understand ;)
>
> Basically your customer is using SO_DONTROUTE to 'pause' incoming TCP traffic
> by not sending ACK.
>
> Interesting trick but quite hacky. I guess that sending ACK with 0 window
> would be less intrusive.
>
> > On Wed, Dec 5, 2018 at 3:17 PM Eric Dumazet  wrote:
> >>
> >> On Wed, Dec 5, 2018 at 3:07 PM yupeng  wrote:
> >>>
> >>> After SO_DONTROUTE is set to 1, the IP layer should not route packets if
> >>> the dest IP address is not in link scope. But if the socket has cached
> >>> the dst_entry, such packets would be routed until the sk_dst_cache
> >>> expires. So we should clear the sk_dst_cache when a user sets the
> >>> SO_DONTROUTE option. Below are server/client python scripts which
> >>> can reproduce this issue:
> >>>
> >>> server side code:
> >>> ==
> >>> import socket
> >>> import struct
> >>>
> >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> >>> s.bind(('0.0.0.0', 9000))
> >>> s.listen(1)
> >>> sock, addr = s.accept()
> >>> sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1))
> >>> while True:
> >>>     data = sock.recv(1024) # here the sock.recv should not return anything
> >>
> >> Why is that so ?
> >>
> >> What is the relation of input path with the SO_DONTROUTE which is for TX ?
> >>
> >> sk_dst_reset(sk) should not impact receive side ?
> >>
> >> Thanks for providing a test !
> >>
> >>>     print(data)
> >>> ==
> >>>
> >>> client side code:
> >>> ==
> >>> import socket
> >>> import time
> >>>
> >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> >>> s.connect(('server_address', 9000))
> >>> while True:
> >>>     s.send(b'foo')
> >>>     print('send foo')
> >>>     time.sleep(1)
> >>> ==
> >>>
> >>> Signed-off-by: yupeng 
> >>> ---
> >>>  net/core/sock.c | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/net/core/sock.c b/net/core/sock.c
> >>> index f5bb89785e47..f00902c532cc 100644
> >>> --- a/net/core/sock.c
> >>> +++ b/net/core/sock.c
> >>> @@ -700,6 +700,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
> >>>  		break;
> >>>  	case SO_DONTROUTE:
> >>>  		sock_valbool_flag(sk, SOCK_LOCALROUTE, valbool);
> >>> +		sk_dst_reset(sk);
> >>>  		break;
> >>>  	case SO_BROADCAST:
> >>>  		sock_valbool_flag(sk, SOCK_BROADCAST, valbool);
> >>> --
> >>> 2.17.1
> >>>
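
To make the reported behaviour concrete, here is a toy Python model of the sequence described in this thread. It is not kernel code: the class ToySocket, its fields and the peer_on_link flag are hypothetical names chosen for illustration, and only SO_DONTROUTE, SOCK_LOCALROUTE, sk_dst_cache and sk_dst_reset come from the discussion itself.

```python
class ToySocket:
    """Toy model of a socket with a cached route (illustration only)."""

    def __init__(self, peer_on_link):
        self.peer_on_link = peer_on_link  # is the peer in link scope?
        self.localroute = False           # stands in for SOCK_LOCALROUTE
        self.dst_cache = None             # stands in for sk_dst_cache

    def set_dontroute(self, value, reset_cache):
        self.localroute = bool(value)
        if reset_cache:                   # stands in for the proposed sk_dst_reset()
            self.dst_cache = None

    def expire_cache(self):
        self.dst_cache = None

    def send(self):
        """Return True if a packet could be sent, False otherwise."""
        if self.dst_cache is None:
            # Only a fresh route lookup takes the flag into account.
            if self.localroute and not self.peer_on_link:
                return False
            self.dst_cache = "route-to-peer"
        return True


# Without the reset, the stale cached route keeps working for a while.
old = ToySocket(peer_on_link=False)
old.send()
old.set_dontroute(1, reset_cache=False)
print(old.send())   # True: the cached route is still used
old.expire_cache()
print(old.send())   # False: the connection is stuck only now

# With the reset, the flag takes effect immediately.
new = ToySocket(peer_on_link=False)
new.send()
new.set_dontroute(1, reset_cache=True)
print(new.send())   # False
```

The model only captures the ordering problem (flag set first, effect visible later); it says nothing about how or when the real cache expires.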


Re: [Patch net-next] call sk_dst_reset when set SO_DONTROUTE

2018-12-05 Thread peng yu
SO_DONTROUTE doesn't impact the TCP receive path, but it should
block the ACKs for received packets. When too many packets are left
unACKed, the client will stop sending, so sock.recv on the server side
stops returning data after it has received some. I extracted the test
code from my customer's production environment. It can reproduce the
issue, but it is not a good example. I will rewrite the test code and
re-submit the patch.

On Wed, Dec 5, 2018 at 3:17 PM Eric Dumazet  wrote:
>
> On Wed, Dec 5, 2018 at 3:07 PM yupeng  wrote:
> >
> > After SO_DONTROUTE is set to 1, the IP layer should not route packets if
> > the dest IP address is not in link scope. But if the socket has cached
> > the dst_entry, such packets would be routed until the sk_dst_cache
> > expires. So we should clear the sk_dst_cache when a user sets the
> > SO_DONTROUTE option. Below are server/client python scripts which
> > can reproduce this issue:
> >
> > server side code:
> > ==
> > import socket
> > import struct
> >
> > s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> > s.bind(('0.0.0.0', 9000))
> > s.listen(1)
> > sock, addr = s.accept()
> > sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1))
> > while True:
> >     data = sock.recv(1024) # here the sock.recv should not return anything
>
> Why is that so ?
>
> What is the relation of input path with the SO_DONTROUTE which is for TX ?
>
> sk_dst_reset(sk) should not impact receive side ?
>
> Thanks for providing a test !
>
> >     print(data)
> > ==
> >
> > client side code:
> > ==
> > import socket
> > import time
> >
> > s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> > s.connect(('server_address', 9000))
> > while True:
> >     s.send(b'foo')
> >     print('send foo')
> >     time.sleep(1)
> > ==
> >
> > Signed-off-by: yupeng 
> > ---
> >  net/core/sock.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index f5bb89785e47..f00902c532cc 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -700,6 +700,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
> >  		break;
> >  	case SO_DONTROUTE:
> >  		sock_valbool_flag(sk, SOCK_LOCALROUTE, valbool);
> > +		sk_dst_reset(sk);
> >  		break;
> >  	case SO_BROADCAST:
> >  		sock_valbool_flag(sk, SOCK_BROADCAST, valbool);
> > --
> > 2.17.1
> >
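
For anyone who wants to check the flag from userspace, the following is a minimal sketch of setting SO_DONTROUTE and reading it back. The helper name set_dontroute is made up for this example; the test script quoted above packs the value with struct.pack, and passing a plain integer as done here should be equivalent.

```python
import socket


def set_dontroute(sock, enabled):
    """Set SO_DONTROUTE on sock and return the value the kernel reports back."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, 1 if enabled else 0)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE)


def show_flag():
    # A fresh, unconnected TCP socket is enough to toggle the option.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("enabled: ", set_dontroute(s, True))
        print("disabled:", set_dontroute(s, False))
```

Reading the option back only confirms that the flag is stored on the socket. It does not show whether an already cached route is still in use, which is the point of the patch.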


Re: [PATCH net-next] add part of TCP counts explanations in snmp_counters.rst

2018-11-19 Thread peng yu
On Mon, Nov 19, 2018 at 10:51 AM Stephen Hemminger
 wrote:
>
> On Fri, 16 Nov 2018 11:17:40 -0800
> yupeng  wrote:
>
> > +* TcpInSegs
> > +Defined in `RFC1213 tcpInSegs`_
> > +
> > +.. _RFC1213 tcpInSegs: https://tools.ietf.org/html/rfc1213#page-48
> > +
> > +The number of packets received by the TCP layer. As mentioned in
> > +RFC1213, it includes the packets received in error, such as checksum
> > +error, invalid TCP header and so on. Only one error won't be included:
> > +if the layer 2 destination address is not the NIC's layer 2
> > +address. It might happen if the packet is a multicast or broadcast
> > +packet, or the NIC is in promiscuous mode. In these situations, the
> > +packets would be delivered to the TCP layer, but the TCP layer will discard
> > +these packets before increasing TcpInSegs. The TcpInSegs counter
> > +isn't aware of GRO. So if two packets are merged by GRO, the TcpInSegs
> > +counter would only increase 1.
>
> It is obvious that TCP, which is L4, masks off all the other things
> that could happen at L3 and L2. So this text is correct but redundant.

You mentioned the text is redundant, but I'm not sure which part you
are talking about.
If you mean the GRO part, here is my explanation: TcpInSegs isn't
aware of GRO, but TcpOutSegs is aware of GSO. When server A sends
packets to server B, TcpOutSegs on server A might be much higher than
TcpInSegs on server B, so I think it is worth pointing out.
If you mean the other part, please let me know.
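
One way to compare the two counters is to parse /proc/net/snmp on both hosts. The sketch below assumes the usual layout of that file (a header line of counter names followed by a line of values for each protocol); the function name parse_snmp and the sample numbers are made up for illustration and are not measurements.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {counter: value}}."""
    counters = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Lines come in pairs: counter names first, then the matching values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        proto_check, numbers = values.split(":", 1)
        if proto != proto_check:
            raise ValueError("header/value lines do not match: %r" % proto)
        counters[proto] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return counters


# Made-up sample in the same layout as /proc/net/snmp.
SAMPLE = """\
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts InCsumErrors
Tcp: 1 200 120000 -1 10 5 0 0 2 1000 4000 3 0 1 0
"""

tcp = parse_snmp(SAMPLE)["Tcp"]
print(tcp["InSegs"], tcp["OutSegs"])
```

On a real system the text would come from open('/proc/net/snmp').read() on each host, and the interesting comparison is OutSegs on the sender against InSegs on the receiver over the same interval.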


Re: [PATCH net-next v2] documentation of some IP/ICMP snmp counters

2018-11-10 Thread peng yu
My changes are based on the previous feedback:
1. Grouped counters by protocol.
2. Removed all text that was pasted from the RFCs.
3. As for simple commands such as 'ping': I want to provide complete
test steps, so I have kept them in the document.
4. Used capital letters for abbreviated keywords.
5. Tried to fix grammar mistakes, but because I reorganized the
document many sentences changed, so it may contain new mistakes.
6. Only IP/ICMP counters are covered, no TCP counters. I will provide
TCP counters after the IP/ICMP counters are accepted.

On Sat, Nov 10, 2018 at 1:38 PM yupeng  wrote:
>
> The snmp_counter.rst explains the meanings of snmp counters. It also
> provides a set of experiments (only 1 for this initial patch) and
> combines the experiments' results with the snmp counters'
> meanings. This is an initial patch; it only explains a part of the
> IP/ICMP counters and provides a simple ping test.
>
> Signed-off-by: yupeng 
> ---
>  Documentation/networking/index.rst|   1 +
>  Documentation/networking/snmp_counter.rst | 222 ++
>  2 files changed, 223 insertions(+)
>  create mode 100644 Documentation/networking/snmp_counter.rst
>
> diff --git a/Documentation/networking/index.rst 
> b/Documentation/networking/index.rst
> index bd89dae8d578..6a47629ef8ed 100644
> --- a/Documentation/networking/index.rst
> +++ b/Documentation/networking/index.rst
> @@ -31,6 +31,7 @@ Contents:
> net_failover
> alias
> bridge
> +   snmp_counter
>
>  .. only::  subproject
>
> diff --git a/Documentation/networking/snmp_counter.rst 
> b/Documentation/networking/snmp_counter.rst
> new file mode 100644
> index ..b1cfc70cd5f6
> --- /dev/null
> +++ b/Documentation/networking/snmp_counter.rst
> @@ -0,0 +1,222 @@
> +===
> +SNMP counter
> +===
> +
> +This document explains the meaning of SNMP counters.
> +
> +General IPv4 counters
> +
> +All layer 4 packets and ICMP packets will change these counters, but
> +these counters won't be changed by layer 2 packets (such as STP) or
> +ARP packets.
> +
> +* IpInReceives
> +Defined in `RFC1213 ipInReceives`_
> +
> +.. _RFC1213 ipInReceives: https://tools.ietf.org/html/rfc1213#page-26
> +
> +The number of packets received by the IP layer. It is incremented at the
> +beginning of the ip_rcv function and is always updated together with
> +IpExtInOctets. It indicates the number of aggregated segments after
> +GRO/LRO.
> +
> +* IpInDelivers
> +Defined in `RFC1213 ipInDelivers`_
> +
> +.. _RFC1213 ipInDelivers: https://tools.ietf.org/html/rfc1213#page-28
> +
> +The number of packets delivered to the upper layer protocols, e.g. TCP, UDP,
> +ICMP and so on. If no one listens on a raw socket, only packets of kernel
> +supported protocols will be delivered; if someone listens on a raw
> +socket, all valid IP packets will be delivered.
> +
> +* IpOutRequests
> +Defined in `RFC1213 ipOutRequests`_
> +
> +.. _RFC1213 ipOutRequests: https://tools.ietf.org/html/rfc1213#page-28
> +
> +The number of packets sent via the IP layer, for both unicast and
> +multicast packets. It is always updated together with
> +IpExtOutOctets.
> +
> +* IpExtInOctets and IpExtOutOctets
> +They are Linux kernel extensions with no RFC definitions. Please note that
> +RFC1213 does define ifInOctets and ifOutOctets, but they
> +are different things. ifInOctets and ifOutOctets include the MAC
> +layer header size, but IpExtInOctets and IpExtOutOctets don't; they
> +only include the IP layer header and the IP layer data.
> +
> +* IpExtInNoECTPkts, IpExtInECT1Pkts, IpExtInECT0Pkts, IpExtInCEPkts
> +They count the four kinds of ECN IP packets; please refer to
> +`Explicit Congestion Notification`_ for more details.
> +
> +.. _Explicit Congestion Notification: 
> https://tools.ietf.org/html/rfc3168#page-6
> +
> +These 4 counters count how many packets were received per ECN
> +status. They count the real frame number regardless of LRO/GRO. So
> +for the same packet, you might find that IpInReceives counts 1, but
> +IpExtInNoECTPkts counts 2 or more.
> +
> +ICMP counters
> +
> +* IcmpInMsgs and IcmpOutMsgs
> +Defined by `RFC1213 icmpInMsgs`_ and `RFC1213 icmpOutMsgs`_
> +
> +.. _RFC1213 icmpInMsgs: https://tools.ietf.org/html/rfc1213#page-41
> +.. _RFC1213 icmpOutMsgs: https://tools.ietf.org/html/rfc1213#page-43
> +
> +As mentioned in RFC1213, these two counters include errors; they
> +would be increased even if the ICMP packet has an invalid type. The
> +ICMP output path will check the header of a raw socket, so
> +IcmpOutMsgs would still be updated if the IP header is constructed by
> +a userspace program.
> +
> +* ICMP named types
> +| These counters include most of common ICMP types, they are:
> +| IcmpInDestUnreachs: `RFC1213 icmpInDestUnreachs`_
> +| IcmpInTimeExcds: `RFC1213 icmpInTimeExcds`_
> +| IcmpInParmProbs: `RFC1213 icmpInParmProbs`_
> +| IcmpInSrcQuenchs: `RFC1213 icmpInSrcQuenchs`_
> +| IcmpInRedirects: `RFC1213 icmpInRedirects`_
> +| IcmpInEchos: `RFC1213 

a propose of snmp counter document

2018-11-08 Thread peng yu
I'm planning to write a document that explains the meaning of the
kernel snmp counters and combines the explanations with some tests,
because I found that lots of the 'TcpExt' and 'IpExt' counters are not
explained in any document. Here is a draft:
https://github.com/yupeng0921/iproute2_learning/blob/master/nstat.md
It is still in progress. I think it might be useful. Besides putting
it in my git repo, does anyone have a suggestion about where I
could contribute this document?

Best regards


Re: [ftrace-bpf 1/5] add BPF_PROG_TYPE_FTRACE to bpf

2017-11-13 Thread peng yu
> 1. anything bpf related has to go via net-next tree.
I found that there is a net-next git repo:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
I will use this repo for future versions of the bpf-ftrace patch set.

> 2.
> this obviously breaks ABI. New types can only be added to the end.
Sure, I will add the new type at the end.
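
The ABI concern can be illustrated with a small Python sketch that numbers names the way a C enum without explicit initializers does. The list below contains only the entries visible in the hunk quoted further down in this message, not the full enum, and the helper names enum_values and changed_values are made up for this example.

```python
def enum_values(names):
    """Number names like a C enum without initializers: 0, 1, 2, ..."""
    return {name: value for value, name in enumerate(names)}


def changed_values(old_names, new_names):
    """Return {name: (old, new)} for existing names whose value changed."""
    old, new = enum_values(old_names), enum_values(new_names)
    return {n: (old[n], new[n]) for n in old_names if old[n] != new[n]}


# Only the entries shown in the quoted hunk; the real enum is longer.
ORIGINAL = [
    "BPF_PROG_TYPE_UNSPEC",
    "BPF_PROG_TYPE_SOCKET_FILTER",
    "BPF_PROG_TYPE_KPROBE",
    "BPF_PROG_TYPE_SCHED_CLS",
]

# Inserting in the middle renumbers everything after the insertion point.
inserted = ORIGINAL[:3] + ["BPF_PROG_TYPE_FTRACE"] + ORIGINAL[3:]
print(changed_values(ORIGINAL, inserted))   # {'BPF_PROG_TYPE_SCHED_CLS': (3, 4)}

# Appending at the end leaves every existing value untouched.
appended = ORIGINAL + ["BPF_PROG_TYPE_FTRACE"]
print(changed_values(ORIGINAL, appended))   # {}
```

Any already compiled userspace program that passes the old numeric value of a renumbered type would silently get a different program type, which is why new entries go at the end.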

> 3.
> this won't even compile, since ftrace_regs is only added in the patch 4.
It does compile, because the ftrace_regs related code is inside an
"#ifdef FTRACE_BPF_FILTER" block. If this macro is not defined, no
ftrace_regs related code is compiled.


> Since bpf program will see ftrace_regs as an input it becomes
> abi, so has to be defined in uapi/linux/bpf_ftrace.h or similar.
> We need to think through how to make it generic across archs
> instead of defining ftrace_regs for each arch.
I'm not sure whether I fully understand your meaning. Like kprobe,
ftrace-bpf needs to get a function's parameters and check them, so
it won't be ABI stable, and it has to depend on the architecture
implementation. I can create a header file like uapi/linux/bpf_ftrace.h,
but I noticed that kprobe doesn't have such a header file; if I'm
wrong, please let me know. As for making it generic across archs, I
know kprobe uses pt_regs as its parameter, and pt_regs is defined per
arch, so I can't see how bpf-ftrace can get a generic interface across
archs if it needs to check a function's parameters. If I misunderstand
anything, please let me know.

> 4.
> the patch 2/3 takes an approach of passing FD integer value in text form
> to the kernel. That approach was discussed years ago and rejected.
> It has to use binary interface like perf_event + ioctl.
> See RFC patches where we're extending perf_event_open syscall to
> support binary access to kprobe/uprobe.
> imo binary interface to ftrace is pre-requisite to ftrace+bpf work.
> We've had too many issues with text based kprobe api to repeat
> the same mistake here.
I noticed that the kprobe-bpf prog is set through the
PERF_EVENT_IOC_SET_BPF ioctl. I will see whether I can reuse this
interface; if it is not suitable, I will try to define a new binary
interface.

> 5.
> patch 4 hacks save_mcount_regs asm to pass ctx pointer in %rcx
> whereas it's only used in ftrace_graph_caller which doesn't seem right.
> It points out to another issue that such ftrace+bpf integration
> is only done for ftrace_graph_caller without extensibility in mind.
> If we do ftrace+bpf I'd rather see generic framework that applies
> to all of ftrace instead of single feature of it.
It is a hard problem. The ftrace framework has lots of tracers: the
function tracer and the function graph tracer use 'gcc -pg' directly,
while other tracers use tracepoints. I need to spend more time to find
a suitable solution.


> 6.
> copyright line copy-pasted incorrectly.
I will fix it.

Summary:
Questions 1, 2 and 6 are easy to fix. For question 4, I need to do
more research, but it shouldn't be very hard. About question 3 and 5, both

2017-11-12 17:02 GMT-08:00 Alexei Starovoitov :
> On Sun, Nov 12, 2017 at 07:28:24AM +, yupeng0...@gmail.com wrote:
>> Add a new type BPF_PROG_TYPE_FTRACE to bpf, so that bpf can be attached to
>> ftrace. Ftrace passes the function parameters to the bpf prog, and the bpf
>> prog returns 1 or 0 to indicate whether ftrace can trace this function. The
>> main purpose is to provide an accurate way to trigger function graph
>> tracing. Changes in code:
>> 1. add FTRACE_BPF_FILTER in kernel/trace/Kconfig. Letting ftrace pass
>> function parameters to bpf requires modifying architecture-dependent code,
>> so this feature is only enabled when it is enabled in
>> Kconfig and the architecture supports it. If an architecture
>> supports this feature, it should define a macro named
>> FTRACE_BPF_FILTER, e.g.:
>> So other code in the kernel can check whether the macro FTRACE_BPF_FILTER
>> is defined to know whether this feature is really enabled.
>> 2. add BPF_PROG_TYPE_FTRACE in bpf_prog_type
>> 3. check the kernel version when loading a BPF_PROG_TYPE_FTRACE bpf prog
>> 4. define ftrace_prog_func_proto. The prog input is a pointer to struct
>> ftrace_regs, which is similar to pt_regs in kprobe. It
>> is architecture-dependent code; if an architecture doesn't define
>> FTRACE_BPF_FILTER, a fake ftrace_prog_func_proto is used.
>> 5. add BPF_PROG_TYPE in bpf_types.h
>>
>> Signed-off-by: yupeng0...@gmail.com
>
> In general I like the bigger concept of adding bpf filtering to ftrace,
> but there are a lot of fundamental issues with this patch set.
>
> 1. anything bpf related has to go via net-next tree.
>
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -118,6 +118,7 @@ enum bpf_prog_type {
>>   BPF_PROG_TYPE_UNSPEC,
>>   BPF_PROG_TYPE_SOCKET_FILTER,
>>   BPF_PROG_TYPE_KPROBE,
>> + BPF_PROG_TYPE_FTRACE,
>>   BPF_PROG_TYPE_SCHED_CLS,
>
> 2.
> this obviously breaks ABI. New types can only be added to the end.
>
>> +static bool 

Re: [iproute PATCH] ss: add detail explains of -m, -o, -e and -i options in ss man page

2017-10-25 Thread peng yu
Thanks for your suggestion. Below is a new patch. What I did:
1. changed all 'package' to 'packet'
2. put my additional text as the second paragraph of each original option
3. checked the man page with aspell
If anything else needs fixing, please let me know.
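
As a quick way to read the -m output documented in the patch below, here is a small Python sketch that splits an skmem:(...) string into its named fields. The function name parse_skmem and the sample line are made up for illustration; the field prefixes (r, rb, t, tb, f, w, o, bl) are the ones listed in the patch.

```python
import re

SKMEM_RE = re.compile(r"skmem:\(([^)]*)\)")
FIELD_RE = re.compile(r"([a-z]+)(\d+)")


def parse_skmem(line):
    """Return {prefix: value} for the skmem:(...) part of an ss -m line, or None."""
    match = SKMEM_RE.search(line)
    if match is None:
        return None
    fields = {}
    for item in match.group(1).split(","):
        field = FIELD_RE.fullmatch(item.strip())
        if field:
            fields[field.group(1)] = int(field.group(2))
    return fields


# Made-up sample in the format described by the patch.
sample = "skmem:(r0,rb87380,t0,tb87040,f4096,w0,o0,bl0)"
print(parse_skmem(sample))
```

The parser keeps whatever prefixes it finds rather than hard-coding the eight names, so it would not break if an ss version printed additional fields.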


commit 9803d27de31028733de789495d78ff7a39385009
Author: yupeng <yupeng0...@gmail.com>
Date:   Wed Oct 25 21:12:21 2017 +

add additional explain in ss man page

Add detail explains of -m, -o, -e and -i options, which are not
documented anywhere

Signed-off-by: Peng Yu <yupeng0...@gmail.com>

diff --git a/man/man8/ss.8 b/man/man8/ss.8
index 3bec97f..401ae3e 100644
--- a/man/man8/ss.8
+++ b/man/man8/ss.8
@@ -37,19 +37,152 @@ Display both listening and non-listening (for TCP
this means established connect
 Display only listening sockets (these are omitted by default).
 .TP
 .B \-o, \-\-options
-Show timer information.
+Show timer information. For tcp protocol, the output format is:
+.br
+timer:(,,)
+.br
+.B 
+the name of the timer, there are five kind of timer names:
+.RS
+.RS
+.BR on ": means one of these timers: tcp retrans timer, tcp early
retrans timer and tail loss probe timer"
+.br
+.BR keepalive ": tcp keep alive timer"
+.br
+.BR timewait ": timewait stage timer"
+.br
+.BR persist ": zero window probe timer"
+.br
+.BR unknown ": none of the above timers"
+.RE
+.B 
+how long time the timer will expire
+.br
+.B 
+how many times the retran occurs
+.RE
 .TP
 .B \-e, \-\-extended
-Show detailed socket information
+Show detailed socket information. The output format is:
+.br
+uid: ino: sk:
+.br
+.B 
+the user id the socket belongs to
+.br
+.B 
+the socket's inode number in VFS
+.br
+.B 
+an uuid of the socket
+
 .TP
 .B \-m, \-\-memory
-Show socket memory usage.
+Show socket memory usage. The output format is:
+.br
+skmem:(r,rb,t,tb,f,w,o,bl)
+.B 
+the memory allocated for receiving packet
+.br
+.B 
+the total memory can be allocated for receiving packet
+.br
+.B 
+the memory used for sending packet (which has been sent to layer 3)
+.br
+.B 
+the total memory can be allocated for sending packet
+.br
+.B 
+the memory allocated by the socket as cache, but not used for
receiving/sending packet yet. If need memory to send/receive packet,
the memory in this cache will be used before allocate additional
memory.
+.br
+.B 
+The memory allocated for sending packet (which has not been sent to layer 3)
+.br
+.B 
+The memory used for storing socket option, e.g., the key for TCP MD5 signature
+.br
+.B 
+The memory used for the sk backlog queue. On a process context, if
the process is receiving packet, and a new packet is received, it will
be put into the sk backlog queue, so it can be received by the process
immediately
 .TP
 .B \-p, \-\-processes
 Show process using socket.
 .TP
 .B \-i, \-\-info
-Show internal TCP information.
+Show internal TCP information. Below fields may appear:
+.br
+.B ts
+show string "ts" if the timestamp option is set
+.br
+.B sack
+show string "sack" if the sack option is set
+.br
+.B ecn
+show string "ecn" if the explicit congestion notification option is set
+.br
+.B ecnseen
+show string "ecnseen" if the saw ecn flag is found in received packets
+.br
+.B fastopen
+show string "fastopen" if the fastopen option is set
+.br
+.B cong_alg
+the congestion algorithm name, the default congestion algorithm is "cubic"
+.br
+.B wscale::
+if window scale option is used, this field shows the send scale
factory and receive scale factory
+.br
+.B rto:
+tcp re-transmission timeout value, the unit is millisecond
+.br
+.B backoff:
+used for exponential backoff re-transmission, the actual
re-transmission timeout value is icsk_rto << icsk_backoff
+.br
+.B rtt:/
+rtt is the average round trip time, rttvar is the mean deviation of
rtt, their units are millisecond
+.br
+.B ato:
+ack timeout, unit is millisecond, used for delay ack mode
+.br
+.B mss:
+max segment size
+.br
+.B cwnd:
+congestion window size
+.br
+.B ssthresh:
+tcp congestion window slow start threshold
+.br
+.B bytes_acked:
+bytes acked
+.br
+.B bytes_received:
+bytes received
+.br
+.B segs_out:
+segments sent out
+.br
+.B segs_in:
+segments received
+.br
+.B send bps
+egress bps
+.br
+.B lastsnd:
+how long time since the last packet sent, the unit is millisecond
+.br
+.B lastrcv:
+how long time since the last packet received, the unit is millisecond
+.br
+.B lastack:
+how long time since the last ack received, the unit is millisecond
+.br
+.B pacing_rate bps/bps
+the pacing rate and max pacing rate
+.br
+.B rcv_space:
+a helper variable for TCP internal auto tuning socket receive buffer
+
 .TP
 .B \-K, \-\-kill
 Attempts to forcibly close sockets. This option displays sockets that are

2017-10-25 8:06 GMT-07:00 Roman Mashak <m...@mojatatu.com>:
> peng yu <yupeng0...@gmail.com> writes:
>
>> commit 340a45f79395144bd14

[iproute PATCH] ss: add detail explains of -m, -o, -e and -i options in ss man page

2017-10-24 Thread peng yu
commit 340a45f79395144bd14fdf9be1904c0036456b6e
Author: yupeng <yupeng0...@gmail.com>
Date:   Tue Oct 24 23:55:29 2017 +

add additional explain in ss man page

Add detail explains of -m, -o, -e and -i options, which are not
documented anywhere

Signed-off-by: Peng Yu <yupeng0...@gmail.com>

diff --git a/man/man8/ss.8 b/man/man8/ss.8
index 3bec97f..4597733 100644
--- a/man/man8/ss.8
+++ b/man/man8/ss.8
@@ -176,6 +176,147 @@ states except for
 - opposite to
 .B bucket

+.SH Additional explain of -m, -o, -e and -i options
+Some fields may have different meanings if the netowrk protocl is
different. Below explain focus on tcp protocol.
+.TP
+.B -m option
+skmem:(r,rb,t,tb,f,w,o,bl)
+
+.B 
+the memory allocated for receiving package
+
+.B 
+the total memory can be allocated for receiving package
+
+.B 
+the memory used for sending package (which has been sent to layer 3)
+
+.B 
+the total memory can be allocated for sending package
+
+.B 
+the memory allocated by the socket as cache, but not used for
receiving/sending pacakge yet. If need memory to send/receive package,
the memory in this cache will be used before allocate additional
memory.
+
+.B 
+The memory allocated for sending package (which has not been sent to layer 3)
+
+.B 
+The memory used for storing socket option, e.g., the key for TCP MD5 signature
+
+.B 
+The memory used for the sk backlog queue. On a process context, if
the process is receving package, and a new package is received, it
will be put into the sk backlog queue, so it can be received by the
process immediately
+.TP
+.B -o option
+timer:(,,)
+
+.B 
+the name of the timer, there are five kind of timer names:
+
+.BR on ": means one of these timers: tcp retrans timer, tcp early
retrans timer and tail loss probe timer"
+
+.BR keepalive ": tcp keep alive timer"
+
+.BR timewait ": timewait stage timer"
+
+.BR persist ": zero window probe timer"
+
+.BR unknown ": none of the above timers"
+
+.B 
+how long time the timer will expire
+
+.B 
+how many times the retran occurs
+.TP
+
+.B -e option
+uid: ino: sk:
+
+.B 
+the user id the socket belongs to
+
+.B 
+the socket's inode number in VFS
+
+.B 
+an uuid of the socket
+
+.TP
+.B -i option
+show tcp internal information
+
+.B ts
+show string "ts" if the timestamp option is set
+
+.B sack
+show string "sack" if the sack option is set
+
+.B ecn
+show string "ecn" if the explicit congestion notification option is set
+
+.B ecnseen
+show string "ecnseen" if the saw ecn flag is found in received packages
+
+.B fastopen
+show string "fastopen" if the fastopen option is set
+
+.B cong_alg
+the congestion algorithm name, the default congestion algorithm is "cubic"
+
+.B wscale::
+if window scale option is used, this field shows the send scale
factory and receive scale factory
+
+.B rto:
+tcp retransmission timeout value, the unit is millisecond
+
+.B backoff:
+used for exponential backoff retransmission, the actual
retransmission timeout vaule is icsk_rto << icsk_backoff
+
+.B rtt:/
+rtt is the average round trip time, rttvar is the mean deviation of
rtt, their units are millisecond
+
+.B ato:
+ack timeout, unit is millisecond, used for delay ack mode
+
+.B mss:
+max segment size
+
+.B cwnd:
+congestion window size
+
+.B ssthresh:
+tcp congestion window slow start threshold
+
+.B bytes_acked:
+bytes acked
+
+.B bytes_received:
+bytes received
+
+.B segs_out:
+segments sent out
+
+.B segs_in:
+segments received
+
+.B send bps
+egress bps
+
+.B lastsnd:
+how long time since the last package sent, the unit is millisecond
+
+.B lastrcv:
+how long time since the last package received, the unit is millisecond
+
+.B lastack:
+how long time since the last ack received, the unit is millisecond
+
+.B pacing_rate bps/bps
+the pacing rate and max pacing rate
+
+.B rcv_space:
+a helper variable for TCP internal auto tunning socket receive buffer
+
 .SH USAGE EXAMPLES
 .TP
 .B ss -t -a