Hi George,

My test environment is snv_93, which already contains the fix for 6637163.

The other thing is that the same test case passes with another NIC driver
in the same test environment, which is why I suspected a driver issue.
But it could still be an upper-layer issue, because the NIC driver that
does not show the problem has only one ring, and its ring size is bigger
than that of the affected driver (which has 4 rings).

I'll do further testing, but before that, could you give me some hints
on this issue? What can I do to debug it further?
 
George Shepherd wrote:
> Hi Oliver.
>
> Rather than a big reassembly list I suspect that you are hitting this bug.
>
> 6637163 ip_rput_fragment[_v6]() spuriously prunes valid frags due to
> unbounded inaccuracy of ill_frag_count
>
> The fix is currently in T-patch form, 137111-05 (afaik; I can't look it
> up right now).
>
> HTH
> -George
>
> *>Date: Sun, 03 Aug 2008 16:59:05 +0800
> *>From: "Oliver.Yang" <[EMAIL PROTECTED]>
> *>Subject: [networking-discuss] A question about IP reassembly
> *>
> *>Hi All,
> *>
> *>What could cause a big reassembly list on a specific ILL?
> *>
> *>I ran into a NIC driver UDP performance issue recently. After some
> *>debugging, I found it is caused by packets being dropped in the upper layer.
> *>
> *>The netstat output showed that ipReasmFails is extremely high while this
> *>issue is happening:
> *>
> *># netstat -I igb1 -s -P ip                     
> *>Name  Mtu  Net/Dest      Address        Ipkts   Ierrs Opkts  Oerrs Collis Queue
> *>igb1  1500 11.0.3.0      11.0.3.11      7716786 0     26538  0     0      0
> *>
> *>IPv4    ipForwarding        =     2     ipDefaultTTL        =   255
> *>        ipInReceives        =7718245    ipInHdrErrors       =     0
> *>        ipInAddrErrors      =     0     ipInCksumErrs       =     0
> *>        ipForwDatagrams     =     0     ipForwProhibits     =     0
> *>        ipInUnknownProtos   =    70     ipInDiscards        =     0
> *>        ipInDelivers        = 48737     ipOutRequests       = 32338
> *>        ipOutDiscards       =     1     ipOutNoRoutes       =     0
> *>        ipReasmTimeout      =    60     ipReasmReqds        =690804
> *>        ipReasmOKs          =    83     ipReasmFails        =695798
> *>        ipReasmDuplicates   =     0     ipReasmPartDups     =     0
> *>        ipFragOKs           =     0     ipFragFails         =     0
> *>        ipFragCreates       =     0     ipRoutingDiscards   =     0
> *>        tcpInErrs           =     0     udpNoPorts          =   118
> *>        udpInCksumErrs      =     0     udpInOverflows      =     0
> *>        rawipInOverflows    =     0     ipsecInSucceeded    =     0
> *>        ipsecInFailed       =     0     ipInIPv6            =     0
> *>        ipOutIPv6           =     0     ipOutSwitchIPv6     =   106
> *>
> *>
> *>My dtrace command showed that the vast majority of the packets were
> *>dropped by ill_frag_prune in the IP layer:
> *>
> *>bash-3.2# dtrace -n mib:::ipIfStatsReasmFails'{@[stack()]=count();}'
> *>dtrace: description 'mib:::ipIfStatsReasmFails' matched 3 probes
> *>^C
> *>              ip`ill_frag_free_pkts+0x84
> *>              ip`ill_frag_prune+0x1d0
> *>              ip`ip_rput_fragment+0x39c
> *>              ip`ip_udp_input+0x80c
> *>              ip`ip_input+0xb94
> *>              dls`soft_ring_drain+0x68
> *>              dls`soft_ring_worker+0x5c
> *>              unix`thread_start+0x4
> *>                1
> *>
> *>              ip`ip_mib2_add_ip_stats+0x1ac
> *>              ip`ip_snmp_get_mib2_ip_traffic_stats+0x130
> *>              ip`ip_snmp_get+0xd8
> *>              ip`snmpcom_req+0x280
> *>              ip`ip_wput_nondata+0x674
> *>              unix`putnext+0x208
> *>              genunix`strput+0x1a0
> *>              genunix`strputmsg+0x26c
> *>              genunix`msgio32+0x398
> *>              genunix`putmsg32+0x98
> *>              unix`syscall_trap32+0xcc
> *>                6
> *>
> *>              ip`ip_mib2_add_ip_stats+0x1ac
> *>              ip`ip_snmp_get_mib2_ip_traffic_stats+0x130
> *>              ip`ip_snmp_get+0xd8
> *>              ip`snmpcom_req+0x280
> *>              ip`ip_wput_nondata+0x674
> *>              unix`putnext+0x208
> *>              unix`putnext+0x208
> *>              unix`putnext+0x208
> *>              unix`putnext+0x208
> *>              unix`putnext+0x208
> *>              genunix`strput+0x1a0
> *>              genunix`strputmsg+0x26c
> *>              genunix`msgio32+0x398
> *>              genunix`putmsg32+0x98
> *>              unix`syscall_trap32+0xcc
> *>               66
> *>
> *>              ip`ill_frag_free_pkts+0x84
> *>              ip`ill_frag_prune+0x100
> *>              ip`ip_rput_fragment+0x39c
> *>              ip`ip_udp_input+0x80c
> *>              ip`ip_input+0xb94
> *>              dls`soft_ring_drain+0x68
> *>              dls`soft_ring_worker+0x5c
> *>              unix`thread_start+0x4
> *>           232825
> *>
> *>
> *>It seems it is caused by a big reassembly list:
> *>
> *>        /* If the reassembly list for this ILL will get too big, prune it */
> *>        if ((msg_len + sizeof (*ipf) + ill->ill_frag_count) >=
> *>            ipst->ips_ip_reass_queue_bytes) {
> *>                ill_frag_prune(ill,
> *>                    (ipst->ips_ip_reass_queue_bytes < msg_len) ? 0 :
> *>                    (ipst->ips_ip_reass_queue_bytes - msg_len));
> *>                pruned = B_TRUE;
> *>        }
> *>
> *>But I would like to know the main reasons that could cause a big
> *>reassembly list on a specific ILL.
> *>
> *>Could you shed some light on this issue? Thanks!
> *>
> *>-- 
> *>Cheers,
> *>
> *>------------------------------------------------------------
> *>Oliver Yang | http://blog.csdn.net/yayong
> *>
> *>_______________________________________________
> *>networking-discuss mailing list
> *>[email protected]
>
>
> George Shepherd
> http://clem.uk/~georges/
> ==============================================================================
>    Solaris Revenue Product Engineering:    |  SUN Microsystems
>        Core team  -Internet                |  Guillemont Park
>    Email: [EMAIL PROTECTED]          |  Camberley GU17 9QG
>    Disclaimer: Less is more, more or less  |  United Kingdom 
> ==============================================================================
>
>   


-- 
Cheers,

------------------------------------------------------------
Oliver Yang | http://blog.csdn.net/yayong
