Re: [j-nsp] SRX 3600 dropped packets - how to debug?
Hey Phil,

A friendly hello from Lancaster Uni, also using SRX 3600's.

Can you reproduce the loss? Or alternatively know source/destination ranges of likely connections? A user it's more likely to affect or can demonstrate it reliably?

As pretty much unless this is a policy that's doing it (if you have "then deny", then get a "then count" on all those rules too, but it sounds like packet loss rather than session creation rejection/failure/timeout), you're gonna be stuck doing a datapath debug.

http://www.juniper.net/techpubs/software/junos-security/junos-security10.2/junos-security-swconfig-security/topic-41983.html

If you're shifting anywhere like the amount of traffic we are you aren't going to want to set up a filter for 0/0 to 0/0. Something I've had to explain to JTAC on numerous occasions (something along the lines of "You want me to enable full flow debugging on three SPC's collectively pushing 8Gbps!?!").

Also you using anything like AppTrack and AppFW/AppQos/AppDos?

I've unfortunately had a fair amount of experience with datapath debugs, so feel free to give me a shout off list.

Cheers,
Peter.

--
Peter Wood
Network Security Specialist
Information Systems Services
Lancaster University
Email: p.w...@lancaster.ac.uk
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] SRX 3600 dropped packets - how to debug?
On 24/05/13 11:33, Wood, Peter (ISS) wrote:
> Hey Phil,
>
> A friendly hello from Lancaster Uni, also using SRX 3600's.
>
> Can you reproduce the loss? Or alternatively know source/destination ranges of likely connections? A user it's more likely to affect or can demonstrate it reliably?

Depends what you mean by reproduce. The counter in question is rising continually, so (assuming that counter can be trusted) it's happening continually. But I have no idea *what* traffic might be being dropped.

Someone suggested to me that this counter might include sessions where the 3-way handshake is not completed successfully, which if true might account for it, but several hundred/sec seems too high for that.

To be clear: we don't have any *reports* of packet loss (well, not since I upgraded to 12.1R6.5 to fix PSN-2012-10-754 ;o) - it's just the counter value incrementing that has me concerned. Could the counter be wrong/misleading?

> As pretty much unless this is a policy that's doing it (if you have "then deny", then get a "then count" on all those rules too, but it sounds like packet loss rather than session creation rejection/failure/timeout), you're gonna be stuck doing a datapath debug.

I did investigate the datapath debug and flow tracing (see below) but neither suggested anything like the rate of events required to match the rate of counter increments. There was a background of:

    CID-00:FPC-11:PIC-00:THREAD_ID-08:RT:SPU invalid session id

...when I had flowtracing enabled, but that seemed to be ~10-20/sec. Unsure if it's related.

Slightly OT, I did spend some time thinking it was dropping some fragmented packets, but that was a red herring - I didn't realise the SRX re-assembles then re-fragments IP frags, which means if some PPPoE customer sends you:

    packet 0-1400
    packet 1400-1450

...the SRX will merge them into a single unfragmented packet on egress - until I realised this, I was missing the egress non-fragment, and thinking they'd been dropped.
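The reassemble-then-refragment behaviour described above can be sketched as a toy model. This is not SRX code: the function names, the (start, end) byte-range representation of a fragment, and the 1480-byte egress payload limit (a 1500-byte Ethernet MTU minus a 20-byte IPv4 header) are all assumptions made for illustration, and the model ignores the rule that real fragment offsets are multiples of 8 bytes.

```python
def reassemble(fragments):
    """Merge the (start, end) payload ranges of one datagram.

    Returns the total payload length, or raises ValueError if the
    ranges leave a gap (i.e. a fragment never arrived).
    """
    expected = 0
    for start, end in sorted(fragments):
        if start > expected:
            raise ValueError(f"gap: bytes {expected}-{start} never arrived")
        expected = max(expected, end)
    return expected


def refragment(total_len, max_payload):
    """Split a reassembled payload into (start, end) ranges for egress."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [(off, min(off + max_payload, total_len))
            for off in range(0, total_len, max_payload)]


# The two ingress fragments from the example: 0-1400 and 1400-1450.
ingress = [(0, 1400), (1400, 1450)]
total = reassemble(ingress)

# Hypothetical egress limit of 1480 payload bytes: everything fits in
# one range, so the egress packet is not a fragment at all.
egress = refragment(total, max_payload=1480)
print(total, egress)
```

With these numbers the egress side carries one 1450-byte payload, so a capture that looks for fragments on egress finds none even though nothing was dropped.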
> http://www.juniper.net/techpubs/software/junos-security/junos-security10.2/junos-security-swconfig-security/topic-41983.html
>
> If you're shifting anywhere like the amount of traffic we are you aren't going to want to set up a filter for 0/0 to 0/0. Something I've had to explain to JTAC on numerous occasions (something along the lines of "You want me to enable full flow debugging on three SPC's collectively pushing 8Gbps!?!").

At the moment, the SRX is sitting in front of our personally owned VRF; this means all our wireless and wired laptops, and RAS VPN address ranges. This is doing about 1Gbps, which is probably still more than I can sensibly debug with flow tracing or packet capture.

It is a shame there isn't a "datapath-debug drops".

> Also you using anything like AppTrack and AppFW/AppQos/AppDos?

They were enabled at one point, but I disabled them whilst investigating the above-mentioned loss/PSN, and haven't turned them back on yet.

> I've unfortunately had a fair amount of experience with datapath debugs, so feel free to give me a shout off list.

That's... slightly ominous! I did wonder about interpretation; the pcap header contains various bits of metadata, but it's unclear to me how to interpret those, and which ones are valuable and which not. Is there any decent guide to that?

Completely unrelated, can I ask if you have separate NPCs or the newer integrated IOC/NPC, and whether you have any comments pro or con the latter?

Cheers,
Phil
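As a rough back-of-envelope for why an unfiltered capture is impractical at these rates, the arithmetic can be sketched as below. It assumes the link is captured at its full quoted rate, decimal units (1 Gbps = 10^9 bits/s), and no per-packet capture overhead; the function name is made up for this example.

```python
def capture_bytes(rate_bps, seconds):
    """Bytes produced by capturing everything on a link for `seconds`."""
    return rate_bps * seconds // 8


one_gbps = 1_000_000_000
eight_gbps = 8 * one_gbps

# One minute of unfiltered capture at each of the rates quoted in the thread.
per_minute_1g = capture_bytes(one_gbps, 60)
per_minute_8g = capture_bytes(eight_gbps, 60)
print(per_minute_1g, per_minute_8g)
```

Under those assumptions that is 7.5 GB per minute at 1Gbps and 60 GB per minute at 8Gbps, which is why a narrow packet filter matters before enabling any capture.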
Re: [j-nsp] ISSU timeouts on MX upgrades due to large routing tables?
Richard,

We are running into similar issues with NSR. Are you running with GRES enabled or have you removed that as well?

Jasper

On Thu, May 23, 2013 at 12:44 AM, Richard A Steenbergen r...@e-gerbil.net wrote:
> On Tue, May 21, 2013 at 09:01:57PM -0400, Clarke Morledge wrote:
> > I was curious to know if anyone has run into any issues with large routing tables on an MX causing ISSU upgrades to fail?
> >
> > On several occasions, I have been able to successfully do an In-Service Software Upgrade (ISSU) in a lab environment but then it fails to work in production.
> >
> > I find it difficult to replicate the issue in a lab, since in production I am dealing with lots of routes as compared to a small lab. Does anyone have any experience where the backup RE gets its new software, then reboots, but because it takes a long time to populate the routing kernel database on the newly upgraded RE, it appears to time out?
> >
> > I have seen behavior like this with upgrades moving from 10.x to a newer 10.y and from 10.x to 11.y.
>
> We had that issue for many years. There is a hard-coded timeout in the NSR process which is very easy to hit if you have a box with a large number of routes. We had a case open on it for about 1.5 years, but Juniper refused to actually fix it (it works fine in the lab), and eventually we just gave up and declared ISSU to be dead. There are way too many other bugs with it anyways, even turning on NSR caused nothing but problems.
>
> --
> Richard A Steenbergen r...@e-gerbil.net http://www.e-gerbil.net/ras
> GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
Re: [j-nsp] SRX 3600 dropped packets - how to debug?
----- Original Message -----
From: Phil Mayers p.may...@imperial.ac.uk
To: Wood, Peter (ISS) p.w...@lancaster.ac.uk
Cc: juniper-nsp@puck.nether.net
Sent: Friday, May 24, 2013 12:02 PM
Subject: Re: [j-nsp] SRX 3600 dropped packets - how to debug?

> At the moment, the SRX is sitting in front of our personally owned VRF; this means all our wireless and wired laptops, and RAS VPN address ranges.

If You run any kind of peer-to-peer apps (uTorrent, eMule, etc, also includes Skype) then You'll see outside peers trying to establish LOADS of unsolicited connections to Your inside hosts. And all of them will be dropped unless You enable full cone NAT.

HTH
Thanks
Alex
Re: [j-nsp] SRX 3600 dropped packets - how to debug?
On 24/05/13 16:05, Alex Arseniev wrote:
>> At the moment, the SRX is sitting in front of our personally owned VRF; this means all our wireless and wired laptops, and RAS VPN address ranges.
>
> If You run any kind of peer-to-peer apps (uTorrent, eMule, etc, also includes Skype) then You'll see outside peers trying to establish LOADS of unsolicited connections to Your inside hosts. And all of them will be dropped unless You enable full cone NAT.

Good suggestion, but that's not it.

Firstly, we don't have *any* NAT in play - all the devices are on public IPs.

Secondly, as mentioned, all the policies are default permit, so any unsolicited connections would be allowed.

Thirdly, this SRX is actually behind *another* firewall (Netscreen 5400s) that will eat the unsolicited connections before the SRX sees them ;o)

Related to that 3rd item, as per my other email, *if* that counter would increment for failed 3-way handshakes, it's possible that the drops are failed sessions which are allowed by the permit-all on the SRX, but then denied by the Netscreen (e.g. SMTP/25, SMB/139, which we block outbound).

So, as per my other email - does anyone know *what* that counter is counting?
[j-nsp] RPM Probes with Event Options
Hi All,

We need to create a test condition to integrate Juniper with a PeerAPP proxy appliance. The server has 4 gigabit interfaces (four /30 IP blocks) and we need to test them (using RPM ping probes) to generate the conditions for an event-options script.

Is there some way to write an event-options script that considers more than one condition (4 RPM ICMP probes tested), combined using AND logic? We only need to turn off the firewall filter that redirects the traffic when all 4 interfaces have gone down...

I have found the following sample config:

http://www.juniper.net/techpubs/en_US/junos12.2/topics/topic-map/junos-script-automation-event-policy-change-configuration.html

But there are only examples that consider one interface and one match condition inside the policy-options. Do you know of any option for considering more than one RPM test probe?

Thanks a lot,
Giuliano
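The AND condition being asked about can be written down as a small sketch in plain Python. This only models the decision logic; it is not Junos event-script code and uses no Junos API. The probe names, the dictionary of probe states, and the "active"/"inactive" return values are all hypothetical.

```python
def all_probes_failed(probe_up):
    """True only when every probe reports down (logical AND of failures).

    `probe_up` maps a probe name to True (last test passed) or False.
    An empty mapping counts as "not all failed", so missing data never
    triggers the change.
    """
    return bool(probe_up) and not any(probe_up.values())


def desired_filter_state(probe_up):
    """Redirect filter stays active unless all four links are down."""
    return "inactive" if all_probes_failed(probe_up) else "active"


# Hypothetical probe names, one per /30 link to the appliance.
one_link_left = {"peerapp-1": False, "peerapp-2": False,
                 "peerapp-3": True, "peerapp-4": False}
all_links_down = {name: False for name in one_link_left}

print(desired_filter_state(one_link_left))
print(desired_filter_state(all_links_down))
```

The point of the design is that a single probe failure does nothing on its own: the filter is only switched off once the last surviving link also fails, and it returns to "active" as soon as any one probe passes again.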