Dan:

On Thu, 3 Aug 2006, Dan Celidonio wrote:

Hi Venu,

Thanks for your response.

I tried your dtrace scripts for both tcp_conn_request()
and tcp_rput_data() twice on the server node.  The first
time I interrupted the tcp_rput_data dtrace before the
client terminated the connection.  The second time I let
it run until the connection was terminated by the client.

This test is just a simple telnet.

I see this on the client:

Connected to xxx.xxx.xxx.xx. [actual IP not shown]
Escape character is '^]'.
Connection closed by foreign host.

Looking at the below results it seems that tcp_rput_data()
never sees the ACK sent by the client.  However, I do see the
ACK come in from the client in our Streams module before
we send it up the IP stack.  We send it up to IP via a
call to ip_input().  I'll need to do more digging.

I'd check ip_tcp_input().


Thanks,
Dan

1st try
=======
[There was just one print in tcp_conn_request()]

That's right; only the SYN comes in here, the rest should be in tcp_rput_data().


CPU FUNCTION
 0  -> tcp_conn_request     TCP 30002e13600 state -3 suna 0 snxt 0 rnxt 0
       lport 52203 dport 23
       seq 2476877873 ack 0 flags 2

[There were many prints in tcp_rput_data()]
I removed all prints which didn't have port 23 printed.


You can make the dtrace probe conditional on destination port 23, i.e.
using my previous example, something like:

fbt::tcp_conn_request:entry
/ *(uint16_t *)(((tcph_t *)((uchar_t *)(ipha_t *)((mblk_t *)arg1)->b_rptr
    + 20))->th_fport) == 23 /
{
        ....
}


I terminated the dtrace before the client terminated
the connection.

CPU FUNCTION
 0    -> tcp_rput_data                         TCP 30002dfe9c0 state -1 suna
614917337 snxt 614917338 rnxt 2476877874
       lport 52203 dport 23
       seq 2476877873 ack 0 flags 2

 0    -> tcp_rput_data                         TCP 30002dfe9c0 state -1 suna
614917337 snxt 614917338 rnxt 2476877874
       lport 52203 dport 23
       seq 2476877873 ack 0 flags 2

 0  -> tcp_rput_data                           TCP 30002dfe9c0 state -1 suna
614917337 snxt 614917338 rnxt 2476877874
       lport 52203 dport 23
       seq 2476877873 ack 0 flags 2

 0    -> tcp_rput_data                         TCP 30002dfe9c0 state -1 suna
614917337 snxt 614917338 rnxt 2476877874
       lport 52203 dport 23
       seq 2476877873 ack 0 flags 2


From the above, it looks like we keep getting the SYN (flags = 2). Do you
see these SYNs in the snoop as well? Does the other side get the SYN-ACK
(and send the ACK)?
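
For anyone reading the traces, the flags value is printed with %x, so it is a hex bitmask of the standard TCP header flags. A small sketch for decoding the values that appear here (the function name decode_tcp_flags is made up for illustration; the bit values are the usual RFC 793 ones):

```python
# Standard TCP header flag bits (RFC 793), low six bits of the flags byte.
TCP_FLAGS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
]


def decode_tcp_flags(hex_text):
    """Turn the hex string printed by %x (e.g. "18") into flag names."""
    value = int(hex_text, 16)
    return [name for bit, name in TCP_FLAGS if value & bit]


# The three flag values that show up in the traces in this thread.
for printed in ("2", "10", "18"):
    print(printed, decode_tcp_flags(printed))
```

So "flags 2" is a bare SYN, "flags 10" a bare ACK, and "flags 18" is PSH plus ACK.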


2nd try
=======
[There was just one print in tcp_conn_request()]

CPU FUNCTION
 0  -> tcp_conn_request     TCP 30002e13600 state -3 suna 0 snxt 0 rnxt
       lport 52463 dport 23
       seq 2667176845 ack 0 flags 2

[There were many prints in tcp_rput_data()]
I removed all prints which didn't have port 23 printed.
I let it run until the telnet connection was closed by
the client.

CPU FUNCTION
 0    -> tcp_rput_data                         TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176845 ack 0 flags 2

 0    -> tcp_rput_data                         TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176845 ack 0 flags 2

 0  -> tcp_rput_data                           TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176845 ack 0 flags 2

 0    -> tcp_rput_data                         TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176845 ack 0 flags 2

 0      -> tcp_rput_data                       TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176870 ack 4250440928 flags 10

This probably causes the bad_ack, since the ack (4250440928) is greater
than tcp_snxt (774266962).
(Look at the comments in tcp_rput_data, LL12779-12790).

-venu
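
To double-check an ack against the send window by hand, here is a rough sketch of the RFC 793 acceptability test (SND.UNA < SEG.ACK <= SND.NXT) done modulo 2^32. The helper names are mine, and this is the textbook test, not necessarily the exact check in tcp_rput_data(). TCP compares sequence numbers in wrapped 32-bit space, so plain unsigned comparison can be misleading for values this far apart:

```python
MOD = 1 << 32


def seq_diff(a, b):
    """Signed 32-bit difference a - b in TCP sequence space."""
    d = (a - b) % MOD
    return d - MOD if d >= (1 << 31) else d


def seq_lt(a, b):
    return seq_diff(a, b) < 0


def seq_leq(a, b):
    return seq_diff(a, b) <= 0


def ack_acceptable(seg_ack, snd_una, snd_nxt):
    """RFC 793 test: SND.UNA < SEG.ACK <= SND.NXT, modulo 2**32."""
    return seq_lt(snd_una, seg_ack) and seq_leq(seg_ack, snd_nxt)


# suna and snxt as printed in the 2nd-try trace.
suna, snxt = 774266961, 774266962
print(ack_acceptable(4250440928, suna, snxt))  # ack value seen in the trace
print(ack_acceptable(774266962, suna, snxt))   # ack equal to snxt
```

With the values from the trace, the ack the client sent falls outside the window, while an ack equal to snxt would pass.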


 0  -> tcp_rput_data                           TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176846 ack 4250440928 flags 18

 0    -> tcp_rput_data                         TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176870 ack 4250440928 flags 10

 0    -> tcp_rput_data                         TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176846 ack 4250440928 flags 18

 0      -> tcp_rput_data                       TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176870 ack 4250440928 flags 10

 0  -> tcp_rput_data                           TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176846 ack 4250440928 flags 18

 0    -> tcp_rput_data                         TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176870 ack 4250440928 flags 10

 0  -> tcp_rput_data                           TCP 30002dfd100 state -1 suna
774266961 snxt 774266962 rnxt 2667176846
       lport 52463 dport 23
       seq 2667176846 ack 4250440928 flags 18


-----Original Message-----
From: venugopal iyer <[EMAIL PROTECTED]>
Sent: Aug 3, 2006 2:35 PM
To: Dan Celidonio <[EMAIL PROTECTED]>
Cc: [email protected]
Subject: Re: [networking-discuss] problem with GLDV3 bge interface



[...]

Some of the problems seen on the server using the
newly converted load balancer driver for bge GLDV3
interface are below.  We set tcp_trace and tcp_debug and use
strace to see these errors on the server node.

1.  The SYN packet coming from the client into the
   traffic cop node and forwarded to the server node
   is getting an "unacceptable sequence number gap"
   error from the TCP layer on the server.

2.  TCP Duplicate segment counter increments
   from the TCP layer on the server.

3.  TCP Duplicate Ack counter increments
   from the TCP layer on the server.

4.  "bad_ack" in SYN_RCVD state.

To get a sense for what's happening in TCP, I'd use dtrace in
tcp_conn_request() and tcp_rput_data() and print out the tcp_t
(get this from the conn_t in arg0) along with its state, snxt,
suna and rnxt. Additionally, also print out the TCP header from
the incoming packet (arg1), the seq, ack and flags, something
like:

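fbt::tcp_conn_request:entry
{
        /* Clause-local variables, so firings on other CPUs don't clash. */
        this->connp = (conn_t *)arg0;
        this->tcp = (tcp_t *)this->connp->conn_tcp;
        this->ipha = (ipha_t *)((mblk_t *)arg1)->b_rptr;
        /* 20 - assuming no IP options etc. */
        this->tcph = (tcph_t *)((uchar_t *)this->ipha + 20);

        /* Connection state as TCP sees it on entry. */
        printf("\tTCP %p state %d suna %u snxt %u rnxt %u\n",
            this->tcp, this->tcp->tcp_state, this->tcp->tcp_suna,
            this->tcp->tcp_snxt, this->tcp->tcp_rnxt);
        /* Ports, seq, ack and flags from the incoming segment. */
        printf("\tlport %u dport %u\n", *(uint16_t *)this->tcph->th_lport,
            *(uint16_t *)this->tcph->th_fport);
        printf("\tseq %u ack %u flags %x\n",
            *(uint32_t *)this->tcph->th_seq, *(uint32_t *)this->tcph->th_ack,
            *this->tcph->th_flags);
}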

similarly for tcp_rput_data.
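
The fixed offset of 20 in the script only holds when the IPv4 header carries no options. As an illustration of what a more general calculation looks like (a standalone sketch on raw bytes, not something to paste into the D script; the function name is made up), the header length comes from the low nibble of the first byte, in 32-bit words:

```python
def tcp_header_offset(ip_packet):
    """Return the byte offset of the TCP header inside an IPv4 packet."""
    version = ip_packet[0] >> 4
    ihl_words = ip_packet[0] & 0x0F  # header length in 32-bit words
    if version != 4 or ihl_words < 5:
        raise ValueError("not a valid IPv4 header")
    return ihl_words * 4


# Synthetic headers: only the first byte matters for this calculation.
no_options = bytes([0x45]) + bytes(19)    # IHL 5 -> 20 bytes
with_options = bytes([0x46]) + bytes(23)  # IHL 6 -> 24 bytes
print(tcp_header_offset(no_options), tcp_header_offset(with_options))
```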

(on an x86 machine you would need to print out the TCP header content
taking endianness into consideration).
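
To make the endianness point concrete, here is a small sketch that packs a TCP header with the values from the first trace and then reads it back both ways. The field names in the dict are mine; the layout is the standard 20-byte TCP header, which is always in network (big-endian) byte order on the wire:

```python
import struct


def parse_tcp_header(raw):
    """Unpack the start of a TCP header using network (big-endian) order."""
    sport, dport, seq, ack, _off_rsvd, flags = struct.unpack("!HHIIBB", raw[:14])
    return {"sport": sport, "dport": dport, "seq": seq, "ack": ack, "flags": flags}


# Header built from the first SYN in the trace: ports 52203 -> 23, flags 0x02.
sample = struct.pack("!HHIIBBHHH", 52203, 23, 2476877873, 0, 0x50, 0x02, 0, 0, 0)

hdr = parse_tcp_header(sample)
print(hdr["sport"], hdr["dport"], hdr["seq"], hex(hdr["flags"]))

# What a little-endian host gets if it reads the port bytes as a native uint16.
misread_dport = struct.unpack("<H", sample[2:4])[0]
print(misread_dport)
```

On a big-endian machine the plain uint16_t dereference in the script gives the right answer; on x86 the same dereference returns the byte-swapped value, so a comparison against 23 would never match.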

I suppose you could restrict the dtrace probes to the port of interest.

Ordinarily, I'd expect the SYN to come into tcp_conn_request(),
which will result in sending the SYN-ACK and transitioning to
SYN-RCVD.  The ACK from the client will be processed in tcp_rput_data.
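
The tcp_state printed by the script is a small signed integer. A lookup sketch for the handshake-related states follows; the numeric values are an assumption, written down as I recall them from the TCPS_* definitions in tcp.h, so please verify them against the header before relying on them:

```python
# ASSUMED values, recalled from the TCPS_* definitions in tcp.h; verify first.
TCP_STATES = {
    -3: "TCPS_LISTEN",
    -2: "TCPS_SYN_SENT",
    -1: "TCPS_SYN_RCVD",
    0: "TCPS_ESTABLISHED",
}


def state_name(value):
    """Map a printed tcp_state value to a name, or flag it as unknown."""
    return TCP_STATES.get(value, "unknown(%d)" % value)


# The listener sees the SYN; the new connection then waits for the ACK.
for printed in (-3, -1, 0):
    print(printed, state_name(printed))
```

If those values are right, "state -3" in tcp_conn_request() is the listener and "state -1" in tcp_rput_data() is the new connection still waiting in SYN-RCVD.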

You can check the code in tcp_rput_data() that results in the
messages/conditions that you list above.

Look at:
  http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/inet/tcp.h

for the TCP structure (tcp_t) and TCP header structure (tcph_t)

Look at:
  http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/inet/tcp/tcp.c

for tcp_conn_request() and tcp_rput_data() - L12764-12789, L12903-12961 etc.

-venu



Thanks,
Dan


This message posted from opensolaris.org
_______________________________________________
networking-discuss mailing list
[email protected]