Re: enterprise change/configuration management and compliance software?
On Mon, Apr 14, 2008 at 9:13 PM, jamie <[EMAIL PROTECTED]> wrote:
> Gentlemen (and Ren!) ;-)
>
> I'm currently investigating options w.r.t. enterprise-wide (over 250
> devices, and by 'device' I mean router and/or switch) configuration
> management (and (ideally) compliance-auditing_and_assurance) software.
>
> We currently use Voyence (now EMC) and are looking into other options for
> various reasons, support being in the top 3 ...
>
> So, I pose: To you operators of multi-hundred-device networks: what do
> you use for such purposes(*)?
> (*) see subject

We have several thousand network devices currently in play:

[EMAIL PROTECTED]:/tftp/conf/latest> ls *.conf | wc -l
7419
[EMAIL PROTECTED]:/tftp/conf/latest>

I hand-read each device configuration check-in email that goes past to see
if there are errors in the configs, security violations, or other WTF-ish
elements in the check-in, and mail back a nag notice to the person who
changed the config. Currently, I receive between 1900 and 3000 email
messages a day. I sleep 3 hours a night.

> jamie rishaw

Hope that helps answer your question.

Matt
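[Editor's note: the hand-review workflow Matt describes--scanning each config
check-in for errors and security violations--is the kind of thing that can be
partially automated. A minimal sketch follows; the rule names and patterns are
purely illustrative assumptions, not Matt's actual tooling.]

```python
# Hypothetical sketch (NOT the actual process described above): scan a
# router config check-in for common red flags before a human reads it.
import re

# Illustrative rules only; a real deployment would carry many more.
RED_FLAGS = {
    "cleartext password": re.compile(r"^\s*(enable )?password \S+", re.M),
    "telnet enabled": re.compile(r"^\s*transport input .*telnet", re.M),
    "permit any any": re.compile(r"permit ip any any", re.M),
}

def audit_config(text):
    """Return (rule, matched line) pairs worth a nag notice."""
    hits = []
    for rule, pattern in RED_FLAGS.items():
        for m in pattern.finditer(text):
            hits.append((rule, m.group(0).strip()))
    return hits

sample = """hostname edge1
enable password s3cret
line vty 0 4
 transport input telnet ssh
"""
print(audit_config(sample))
```

Feeding each check-in mail through something like this would at least
triage the 1900-3000 daily messages down to the ones with actual findings.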
Re: Yahoo Mail Update
On Mon, Apr 14, 2008 at 6:18 AM, Rich Kulawiec <[EMAIL PROTECTED]> wrote:
> On Sun, Apr 13, 2008 at 03:55:13PM -0500, Ross wrote:
> > Again I disagree with the principle that this list should be used for
> > mail operation issues but maybe I'm just in the wrong here.
>
> I don't think you're getting what I'm saying, although perhaps I'm
> not saying it very well.
>
> What I'm saying is that operational staff should be *listening* to
> relevant lists (of which this is one) and that operational staff
> should be *talking* on lists relevant to their particular issue(s).

Completely agree.

> Clearly, NANOG is probably not the best place for most SMTP or HTTP
> issues, but some of the time, when those issues appear related to
> topics appropriate for NANOG, it might be. The rest of the time,
> the mailop list is probably more appropriate.
>
> While I prefer to see topics discussed in the "best place" (where
> there is considerable debate over what that might be) I think that
> things have gotten so bad that I'm willing to settle for, in the
> short term, "a place", because it's easier to redirect a conversation
> once it's underway than it seems to be to start one.
>
> For example: the silence from Yahoo on this very thread is deafening.

I think if you check historically, you'll find that Yahoo network
operations team members are doing exactly as you indicate, and are
"*talking* on lists relevant to their particular issue(s)"; that is to
say, here on NANOG, when it comes to networking issues, deafening silence
has not been the modus operandi.

The mistaken notion that a *network operations* list should have people
on it to address mail server response code complaints is where I disagree
with you. Ask about a BGP leak, and it'll get fixed. Enquire about how to
engage in peering with Yahoo, and you'll get flooded with answers; those
are items the folks who read the list are empowered to deal with.
Questions about topics not related to the list, which they aren't
empowered to deal with, are going to be met with silence, because you're
trying to talk to the wrong people in the wrong forum.

> ---Rsk

Matt
--always speaking for himself--his employer is more likely to pay him to shut up.
Re: Yahoo Mail Update
On 4/10/08, chuck goolsbee <[EMAIL PROTECTED]> wrote:
> > An anonymous source at Yahoo told me that they have pushed
> > a config update sometime today out to their servers to help with these
> > deferral issues.
> >
> > Please don't ask me to play proxy on this or any
> > other issues you may have, but take a look at your queues and
> > they should be getting better.
> >
> > - Jared
>
> Thanks for the update Jared. I can understand your request to not be used
> as a proxy, but it exposes the reason why Yahoo is thought to be clueless:
> They are completely opaque.
>
> They can not exist in this community without having some visibility and
> interaction on an operational level.
>
> Yahoo should have a look at how things are done at AOL. While the feedback
> loop from the *users* at AOL is mostly a source of entertainment, dealing
> with the postmaster staff at AOL is a benchmark in how it should be done.

*heh* Well, depending upon how the battle turns out, Yahoo is likely to
go the way of whomever its new partner will be--which will either be more
like AOL, or more like Hotmail. Sounds like there's already some amount
of preference, at least among this group, as to which way they'd prefer
to see the battle go. ^_^;

Matt

> Proxy that message over and perhaps this issue of Yahoo's perennially
> broken mail causing the rest of us headaches will go away. It seems to come
> up here on nanog and over on the mailop list every few weeks.
>
> --chuck
Re: /24 blocking by ISPs - Re: Problems sending mail to yahoo?
On 4/11/08, Raymond L. Corbin <[EMAIL PROTECTED]> wrote:
>
> It's not unusual to do /24 blocks; however, Yahoo claims they do not keep any
> logs as to what causes the /24 block. If they kept logs and were able to tell
> us which IP address in the /24 sent abuse to their network, we would then be
> able to investigate it. Their stance of 'it's coming from your network, you
> should know' isn't really helpful in solving the problem. When an IP is
> blocked, a lot of ISPs can tell you why. I would think when they block a /24
> they would at least be able to decipher who was sending the abuse to their
> network to cause the block, and not simply say 'We're sorry, our anti-spam
> measures do not conform with your business practices'. Logging into every
> server using a /24 is looking for a needle in a haystack.
>

*heh* And yet just last year, Yahoo was loudly denounced for keeping
logs that allowed the Chinese government to imprison political
dissidents. Talk about damned if you do, damned if you don't...

I guess logs should only be kept as long as they can only be used for
good, and not evil?

Matt

> -Ray
Re: cooling door
On 3/29/08, Alex Pilosov <[EMAIL PROTECTED]> wrote:
>
> Can someone please, pretty please with sugar on top, explain the point
> behind high power density?
>
> Raw real estate is cheap (basically, nearly free). Increasing power
> density per sqft will *not* decrease cost; beyond 100W/sqft, the real
> estate costs are a tiny portion of total cost. Moving enough air to cool
> 400 (or, in your case, 2000) watts per square foot is *hard*.
>
> I've recently started to price things as "cost per square amp". (That is,
> 1A power, conditioned, delivered to the customer rack and cooled.) Space
> is really irrelevant--to me, as colo provider, whether I have 100A going
> into a single rack or 5 racks is irrelevant. In fact, my *costs*
> (including real estate) are likely to be lower when the load is spread
> over 5 racks. Similarly, to a customer, all they care about is getting
> their gear online, and they couldn't care less whether it needs to be in
> 1 rack or in 5 racks.
>
> To rephrase vijay, "what is the problem being solved"?

I have not yet found a way to split the ~10kW power/cooling demand of a
T1600 across 5 racks. Yes, when I want to put a pair of them into an
exchange point, I can lease 10 racks, put T1600s in two of them, and
leave the other 8 empty; but that hasn't helped either me the customer or
the exchange point provider: they've had to burn more real estate for
empty racks that can never be filled, I'm paying for floor space in my
cage that I'm probably going to end up using for storage rather than just
have it go to waste, and we still have the problem of two very hot spots
that need relatively 'point' cooling solutions.

There are very specific cases where high-density power and cooling cannot
simply be spread out over more space; thus, research into areas like this
is still very valuable.

Matt
Re: Yahoo! clue (Slightly OT: Spiders)
On 3/30/07, Zach White <[EMAIL PROTECTED]> wrote:

On Thu, Mar 29, 2007 at 10:17:50AM -0400, Kradorex Xeron wrote:
> Another problem is that the Yahoo/Inktomi search robots do not stop if no site
> is present at that address. Thus, someone could register a DNS name and have
> a site set on it temporarily, just long enough for Yahoo/Inktomi's bots to
> notice it, then redirect it thereafter to any internet host's address, and the
> bots would proceed to that host and access it over and over in succession,
> wasting bandwidth on the user end (which in most cases is being
> monitored and is limited, sometimes highly, by the ISP), and, on the bot's
> end, wasting time that could have been used spidering other sites.

It's not limited to that. I bought this domain, which had previously been
in use. I've owned the domain for over 5 years, but I still get requests
for pages that I've never had up.

<[EMAIL PROTECTED]:/var/www/logs:8>$ grep ' 404 ' access_log | grep darkstar.frop.org | awk '/Yahoo/ { print $8 }' | wc -l
830
<[EMAIL PROTECTED]:/var/www/logs:9>$ grep ' 404 ' access_log | grep darkstar.frop.org | awk '/Yahoo/ { print $8 }' | sort -u | wc -l
82

That's 82 unique URLs that have been returning a 404 for over 5 years.
That log file was last rotated 2006 Sep 26. That's an average of 138
requests per month for pages that don't exist, on that one domain alone.
How many bogus requests are they sending each month, and what can we do
to stop them? (The first person to say something involving robots.txt
gets a cookie made with pickle juice.)

Sure, on my domain alone that's not a big deal. It hasn't cost me any
money that I'm aware of, and it hasn't caused any trouble. However, it is
annoying, and at some point it becomes a little ridiculous. Can anyone
that runs a large web server farm weigh in on these sorts of requests?
Has this annoyance, multiplied over thousands of domains and IPs, caused
you problems? Increased bandwidth costs?
-Zach

Speaking purely for myself, and not for any other organization, I would
wonder what level of response you had gotten from the abuse address
listed in the requesting netblock:

[EMAIL PROTECTED]:/home/mrtg/archive> whois -h whois.ra.net 74.6.0.0/16
route:      74.6.0.0/16
descr:      YST
origin:     AS14778
remarks:    Send abuse mail to [EMAIL PROTECTED]
mnt-by:     MAINT-AS7280
source:     RADB
[EMAIL PROTECTED]:/home/mrtg/archive>

First line of inquiry in my mind would be to use the slurp@ email, and
work my way along from there.

Matt
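[Editor's note: Zach's grep/awk pipeline above is easy to replicate. A
minimal Python sketch of the same count, assuming a common-log-style
format--the log lines and user-agent string below are made up for
illustration, not taken from Zach's server.]

```python
# Rough equivalent of the shell pipeline above: count total and unique
# URLs that a given crawler requested and got a 404 for.
def count_crawler_404s(lines, agent="Yahoo"):
    urls = [line.split()[6]          # request path in common log format
            for line in lines
            if " 404 " in line and agent in line]
    return len(urls), len(set(urls))

# Fabricated sample log lines, purely for illustration.
log = [
    '1.2.3.4 - - [26/Sep/2006:00:00:01 -0700] "GET /old/page1 HTTP/1.0" 404 0 "-" "Yahoo! Slurp"',
    '1.2.3.4 - - [26/Sep/2006:00:00:02 -0700] "GET /old/page1 HTTP/1.0" 404 0 "-" "Yahoo! Slurp"',
    '1.2.3.4 - - [26/Sep/2006:00:00:03 -0700] "GET /old/page2 HTTP/1.0" 404 0 "-" "Yahoo! Slurp"',
    '5.6.7.8 - - [26/Sep/2006:00:00:04 -0700] "GET /real HTTP/1.0" 200 99 "-" "Mozilla"',
]
total, unique = count_crawler_404s(log)
print(total, unique)  # 3 total 404 hits, 2 unique URLs
```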
Re: Need BGP clueful contact at Global Crossing
On 12/14/06, Lasher, Donn <[EMAIL PROTECTED]> wrote:

On 14 Dec 2006 09:47:46 -0500, Michael A. Patton <[EMAIL PROTECTED]> wrote:
>> If there are any BGP clueful contacts at Global Crossing listening (or
>> if someone listening wants to forward this to them :-), I would
>> appreciate your getting in touch.

> Out of curiosity, why do you think anyone here on NANOG would be willing
> to bother the clueful contacts they know at provider (X) based on an
> email like this? It's absolutely content-free.

Having been on both sides of an issue like this one, I'd much rather see
polite requests like the original requestor's, rather than a 10-page dump
on why provider X is severely borked. Good netiquette, seems to me.

A 10-page dump is excessive; but a one- or two-line "I'm seeing bad
advertisements from AS at the following peering location" goes a long way
toward explaining what the need and urgency is around the issue.
Re: Need BGP clueful contact at Global Crossing
On 14 Dec 2006 09:47:46 -0500, Michael A. Patton <[EMAIL PROTECTED]> wrote:
> If there are any BGP clueful contacts at Global Crossing listening (or
> if someone listening wants to forward this to them :-), I would
> appreciate your getting in touch.

Out of curiosity, why do you think anyone here on NANOG would be willing
to bother the clueful contacts they know at provider (X) based on an
email like this? It's absolutely content-free.

Now, if you included examples of BGP announcements that were being leaked
that shouldn't be, or prefixes of yours that they were accidentally
hijacking, or traceroutes going from San Jose to Paris and then back to
Palo Alto within their network, or some other level of operationally
interesting content, then it's much more likely the issue would be passed
along, either via forwarding the email or, if the issue was sufficiently
interesting, via a more immediate channel (cell phone/IM/IRC/smoke
signal/INOC-DBA phone/etc.).

But as it currently stands, my view of Global Crossing's network doesn't
show any problems worth contacting them about, so I'm unlikely to pass
along your request. For all I know, you might really be a terrorist out
to collapse their infrastructure by sleep-depriving their backbone
engineers night after night with inane requests until their REM-deprived
brains fat-finger the router configs into oblivion. And that just
wouldn't be good.

So. How about trying again, but with relevant content that indicates an
operational issue with their network, and then we can pass that along to
the right folks who can look into it.

Thanks!

Matt
(not now, nor have I ever been, affiliated with 3549, in case there's any
possibility of confusion)
Re: comcast routing issue question
On 11/29/06, Jim Popovitch <[EMAIL PROTECTED]> wrote:

On Thu, 2006-11-30 at 00:06 -0500, Jim Popovitch wrote:
> Question: What could cause the first trace below to succeed, but the
> second trace to fail?
>
> $ mtr 69.61.40.35
> HOST: blue           Loss%  Snt  Last   Avg  Best  Wrst
>  1. 192.168.3.1       0.0%    1   4.3   4.3   4.3   4.3
>  2. 73.62.48.1        0.0%    1  10.6  10.6  10.6  10.6
>  3. 68.86.108.25      0.0%    1  11.4  11.4  11.4  11.4
>  4. 68.86.106.54      0.0%    1   9.8   9.8   9.8   9.8
>  5. 68.86.106.9       0.0%    1  20.5  20.5  20.5  20.5
>  6. 68.86.90.121      0.0%    1  11.3  11.3  11.3  11.3
>  7. 68.86.84.70       0.0%    1  27.7  27.7  27.7  27.7
>  8. 64.213.76.77      0.0%    1  24.5  24.5  24.5  24.5
>  9. 208.50.254.150    0.0%    1  39.4  39.4  39.4  39.4
> 10. 208.49.83.237     0.0%    1  46.6  46.6  46.6  46.6
> 11. 208.49.83.234     0.0%    1  40.7  40.7  40.7  40.7
> 12. 69.61.40.35       0.0%    1  43.9  43.9  43.9  43.9
>
> $ mtr 69.61.40.34
> HOST: blue           Loss%  Snt  Last   Avg  Best  Wrst
>  1. 192.168.3.1       0.0%    1   1.1   1.1   1.1   1.1
>  2. 73.62.48.1        0.0%    1   9.9   9.9   9.9   9.9
>  3. 68.86.108.25      0.0%    1   9.3   9.3   9.3   9.3
>  4. 68.86.106.54      0.0%    1   9.6   9.6   9.6   9.6
>  5. 68.86.106.9       0.0%    1   9.0   9.0   9.0   9.0
>  6. 68.86.90.121      0.0%    1  18.2  18.2  18.2  18.2
>  7. 68.86.84.70       0.0%    1  23.9  23.9  23.9  23.9
>  8. ???              100.0    1   0.0   0.0   0.0   0.0
>
> Taking the 69.61.40.33/28 subnet a bit further, .36 drops at
> 68.86.84.70 but .37 - .39 make it. .40 drops at 68.86.84.70, but .41
> makes it.
>
> Crazy.

Btw, the problem has now been resolved; however, I'm still curious as to
what scenario could have caused that.

-Jim P.

eBGP multihop peering across a pair of 10 GigE links with static routes
pointing to the remote router loopback; one link goes south, but the
interface still shows as up/up, and voila--depending upon the hash, your
packets may go across the good link, or they may disappear into the black
hole of oblivion.

This is why multipath is a good thing, and eBGP multihop with static
routes is a Bad Thing(tm).

Matt
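[Editor's note: the mechanism Matt describes--per-destination hashing
across two parallel next-hops, one of which is silently black-holing--can
be illustrated with a toy model. The real routers' hash function is
vendor-specific; the 2-way modulo hash below is purely an assumption for
illustration, not an attempt to reproduce Jim's exact .34/.35 pattern.]

```python
# Toy model: two equal-cost static next-hops, per-destination hashing.
# If one link is up/up but black-holing, roughly half of all
# destinations silently fail while their neighbors work fine.
import ipaddress

NEXT_HOPS = ["linkA (healthy)", "linkB (up/up but black-holing)"]

def pick_link(dst):
    """Hash the destination address onto one of the two parallel links
    (a stand-in for whatever hash the real forwarding plane uses)."""
    return NEXT_HOPS[int(ipaddress.ip_address(dst)) % 2]

for dst in ["69.61.40.34", "69.61.40.35", "69.61.40.36", "69.61.40.37"]:
    print(dst, "->", pick_link(dst))
```

With this toy hash, adjacent addresses land on alternating links, which
is why a trace to one address can succeed while its neighbor dies at the
same hop--exactly the "crazy" symptom in the thread.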
Re: link between Sprint and Level3 Networks is down in Chicago
On 11/9/06, Deepak Jain <[EMAIL PROTECTED]> wrote:

Does someone know if this is a *single* link down? It seems bizarre to me
that there would only be a single link (geographically) between those
two. Whatever happened to redundancy?

Deepak

From the outside, this appeared to be more like a CEF consistency sort of
thing; routes were still carrying packets to the interconnect, but the
packets were not successfully making it across the interconnect. I would
hazard a guess that had the link truly gone down in the classic sense,
BGP would have done the more proper thing, and found a different path for
the routes to propagate along.

Again, this is speculation from the outside, based on the path packets
were taking before dropping on the floor.

Matt

Dennis Dayman wrote:
> We received confirmation from Time Warner. The link between Sprint and
> Level3 Networks is down in Chicago. This has been an issue since 3:10 PM
> EST. Time Warner has a ticket open to address the issue. Not sure what it
> is yet.
>
> -Dennis
Re: SprintLink peering issue in Chicago?
On 11/9/06, Olsen, Jason <[EMAIL PROTECTED]> wrote:

At around 1345 Central it was brought to my attention that we had lost
access to a number of websites out on the 'net... Two big-name examples:
Oracle, which has our development team screaming for my blood, and, of
course, Yahoo, which means the rest of the userbase hates me.

Traceroutes like the two below for Oracle generally die after one of
Sprint's routers or its peer with Level3. I've already opened a case with
SprintLink's broadband group, and the tech I've spoken to said that there
has been an influx of calls about routing/website availability problems,
but nothing had been identified inside Sprint yet. Just curious if
anybody else is seeing this sort of action.

Sprint has been made aware of the issue, as has Level3.

Matt

[EMAIL PROTECTED] [/export/home/jolsen]
$ traceroute www.oracle.com
traceroute: Warning: Multiple interfaces found; using 10.2.2.230 @ ce0
traceroute to www.oracle.com (141.146.8.66), 30 hops max, 40 byte packets
 1  core2-vlan1.obt.devry.edu (10.2.2.1)  0.407 ms  0.278 ms  0.265 ms
 2  obtfw-virtual.obt.devry.edu (10.2.1.10)  1.413 ms  2.380 ms  2.400 ms
 3  * * 205.240.70.2 (205.240.70.2)  5.209 ms
 4  * * sl-gw32-chi-6-0-ts3.sprintlink.net (144.232.205.237)  10.738 ms
 5  * sl-bb21-chi-4-2.sprintlink.net (144.232.26.33)  14.616 ms  32.739 ms
 6  sl-bb20-chi-14-0.sprintlink.net (144.232.26.1)  16.901 ms  33.400 ms  27.028 ms
 7  sl-st20-chi-12-0.sprintlink.net (144.232.8.219)  42.269 ms  6.190 ms  3.835 ms
 8  * 209.0.225.21 (209.0.225.21)  9.971 ms  148.152 ms
 9  * * *
10  * * *
11  * * *
12  * * *

-- and --

[EMAIL PROTECTED] [/usr/local/sbin]
# ./tcptraceroute www.oracle.com 80
Selected device ge0, address 10.2.2.4 for outgoing packets
Tracing the path to www.oracle.com (141.146.8.66) on TCP port 80 (http), 30 hops max
 1  10.2.2.1 (10.2.2.1)  0.289 ms  0.224 ms  0.208 ms
 2  10.2.1.10 (10.2.1.10)  1.547 ms  1.502 ms  1.218 ms
 3  205.240.70.2 (205.240.70.2)  2.555 ms  5.551 ms  6.408 ms
 4  sl-gw32-chi-6-0-ts3.sprintlink.net (144.232.205.237)  4.120 ms  8.185 ms  6.024 ms
 5  sl-bb21-chi-4-2.sprintlink.net (144.232.26.33)  5.470 ms  3.884 ms  6.889 ms
 6  sl-bb20-chi-14-0.sprintlink.net (144.232.26.1)  8.851 ms  7.624 ms  5.671 ms
 7  sl-st20-chi-12-0.sprintlink.net (144.232.8.219)  7.913 ms  7.283 ms  7.427 ms
 8  209.0.225.21 (209.0.225.21)  4.730 ms  6.033 ms  7.925 ms
 9  * * *
10  * * *
11  * * *
Re: UUNET issues?
On 11/4/06, Randy Bush <[EMAIL PROTECTED]> wrote:

Chris L. Morrow wrote:
> "Could you be any less descriptive of the problem you are seeing?"

the internet is broken. anyone know why?

Because we didn't deploy IPv6 quickly enough? ;P

Matt
Re: Yahoo Postmaster contact, please
On 11/3/06, Matt Clauson <[EMAIL PROTECTED]> wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greetings, NANOGers.

I've got a mail cluster that's been spooling about 5 messages for the
past week or so (with very little drain and traffic passing), and my mail
admin reports that attempted contacts to the Yahoo Postmaster are not
getting answered. Can someone over there drop me a line off-list, please?

- --mec

Amusingly enough, Gmail tossed this in my spam folder, so I didn't see it
until people started replying to it. I have no idea if that's indicative
of anything with respect to Yahoo or not, but it might indicate a
possible reason for mail deferral from some sites.

If you're having network connectivity issues reaching Yahoo, NANOG would
seem like a reasonable place to raise questions--but this isn't really a
list for mail admins to hang out on. It looks like network connectivity
between dotorg.org and Yahoo is good, so I'm not sure if there's anything
people on this list could help you with--but if you do have network
connectivity issues in the future, there are definitely people here who
can address those concerns.

Matt

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (MingW32)
Comment: GnuPT 2.7.2
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFS6w9vDNtj3aXDYkRAu8hAJkBl7fcSpXG1p0nU9QsWHReHfQsKwCdFj20
LrLTe2HcgNremAEoYIp983Y=
=+e8X
-----END PGP SIGNATURE-----
Re: WSJ: Big tech firms seeking power
On 6/14/06, Sean Donelan <[EMAIL PROTECTED]> wrote:

Since power consumption was a topic at the last NANOG meeting.
(Subscription required, or buy a copy of the Wall Street Journal from a
newsstand.)

http://online.wsj.com/article/SB115016534015978590.html

Surge in Internet Use, Energy Costs Has Big Tech Firms Seeking Power
By KEVIN J. DELANEY and REBECCA SMITH
Wall Street Journal
June 13, 2006; Page A1

With both Internet services and power costs soaring, big technology
companies are scouring the nation to secure enough of the cheap
electricity that is vital to their growth.

The search is being led by companies including Microsoft Corp., Yahoo
Inc. and IAC/InterActiveCorp. Big Internet firms have been adding
thousands of computer servers to data centers to handle heavy customer
use of their services, including ambitious new offerings such as online
video.

[...]

And, just to be fair, Google gets their own bit of news on the power
front:

http://www.iht.com/articles/2006/06/13/business/search.php

I wonder just how much power it takes to cool 450,000 servers.

Matt
Re: 2006.06.07 NANOG-NOTES TCP Anycast--don't spread the FUD!
On 6/12/06, Rodrick Brown <[EMAIL PROTECTED]> wrote:

Looks like this document may have been removed? The link appears to be
dead. Any mirrors?

The slide deck hadn't been put online when I sent my notes; I took a
guess at what the location might end up being, but guessed wrong. The
actual location ended up being:

http://www.nanog.org/mtg-0606/pdf/levine.pdf

Matt

--
Rodrick R. Brown
Senior Systems Engineer
http://www.rodrickbrown.com
http://groups.yahoo.com/group/wallstandtech
Re: 2006.06.07 NANOG-NOTES Smart Network Data Services
On 6/9/06, Simon Waters <[EMAIL PROTECTED]> wrote:

On Friday 09 Jun 2006 12:22, Matthew Petach wrote:
> SNDS tomorrow
> Usability

The sign-up process is very painful. Microsoft Passports really aren't
appropriate for business accounts; my employer doesn't have a mother's
maiden name, or a first pet. At one point it claimed the name of my first
pet must have more than 5 characters in it? (Perhaps they should aim for
things likely to have more information in them; besides, my mother's
maiden name has been published in the newspapers.)

I sent a request for help, as the process fell over at the stage of
authorising the first address range I requested, with a failure to handle
the URL sent for me to click.

Interesting--it's good for me to hear what people are saying about it, as
I can't access it myself--my MSN accounts were all locked, and part of
the termination agreement stipulated that I'm forbidden from accessing
their services. It does mean the service is limiting its own scope by
requiring Passport-based logins like that, as I'll never be able to use
it to see if any of the domains/netblocks I'm responsible for might be
originating spam.

Perhaps if Microsoft is truly interested in helping clean up the
Internet, they might lift the Passport login requirement?

Matt
[tempted to set Reply-To: to [EMAIL PROTECTED], but that might be
considered antisocial. ^_^ ]
Re: 2006.06.06 NANOG-NOTES MPLS TE tutorial
On 6/8/06, Matthew Petach <[EMAIL PROTECTED]> wrote:

(still here, just been really busy at work today; will try to finish
sending the notes out tonight. --Matt)

2006.06.06 MPLS TE tutorial
Pete Templin, Nextlink

Gyah!! Huge apologies to Pete, who really works for Texlink. I used to
work at Nextlink, and in taking notes, my fingers went down their old
familiar path a bit too easily. Again, Pete Templin works at Texlink,
not Nextlink--apologies for that gaffe, Pete. ^_^;;

Matt
2006.06.07 NANOG-NOTES DNSSEC bootstrapping with DLV
(last notes from NANOG37, yay! I definitely fell further behind this time
around than in Dallas. Unfortunately, I don't think I'll be allowed to go
to St. Louis, so I probably won't be able to provide notes for NANOG38.
--Matt)

2006.06.07 Deploying DNSSEC--bootstrap yourself
Joao Damas, ISC
[notes are at http://www.nanog.org/mtg-0606/pdf/joao-damas.pdf]

DNSSEC status:
the standard is complete and usable, with some minor nits with regard to
some privacy issues. There are 2 implementations (NSD, BIND) and at least
one DNSSEC-aware resolver (BIND 9.3.2 and later). Really, you just need
some data.

DNSSEC follows a hierarchical model for signatures:
  sign the root zone;
  get the root zone to delegation-sign TLDs;
  get TLDs to delegation-sign SLDs, etc.

Today, the root zone remains unsigned, and will likely stay this way for
some time. Very few TLDs have signed their zones and offer delegation
signatures: .se, .ru, .org.

DNSSEC provides for local trust anchors; you can use the trust-anchors
clause in BIND. Problem: if you have too many, it becomes a nightmare to
maintain, so it doesn't get used--a very manual process.

Enter DLV (DNSSEC lookaside validation):
it's an implementation feature, not a change to the protocol--a matter of
local policy. It enables access to a remote, signed repository of trust
anchors via the DNS. Implemented in BIND's resolver so far; more to
follow? Unfortunately, it requires you to trust the remote repository.

DLV lookup:
a DLV-enabled resolver will try to find a secure entry point using
regular DNSSEC; only if that fails is DLV used, if it is configured.
[picture of DLV lookup chain]

On the resolver (BIND), add to named.conf in the options section:

  // DNSSEC config
  dnssec-enable yes;
  dnssec-lookaside . trust-anchor dlv.isc.org.;

Get the key from ISC's web site: http://www.isc.org/ops/dlv

ISC is operating a DLV registry free of charge for anyone who wants to
secure their DNS. Likely some closed orgs will use their own (e.g. .mil).
Have a look, start using it!

Any questions?
Q: Mark Kosters, Verisign: Any plans to configure DLV registries per TLD?
A: The BIND code only allows for one right now.
Q: It would be good to allow it to be configured per TLD.

Q: Randy Bush, IIJ: wants some feeling or understanding of how IANA/the
root would validate keys/zones it holds keys for; he doesn't understand
how ISC proposes to validate the keys it would be storing. He suggests
they publish the security policy.
A: In the case of registrars proxying keys, they trust the registrar.
Otherwise, it's like PGP: show me your face, show me your key.

Q: Paul Vixie, ISC, following up on Mark Kosters: you can only have one
DLV for any point in the namespace, but you can specify a different one
for a TLD than for the root. That allows a TLD DLV to be paranoid, like
.mil, who doesn't want to trust anyone else with key information. If
every TLD wanted to do that, they would find high levels of
cut-and-paste fatigue, so ISC will operate a root-level DLV server as
well.

Q: Rick Wesson, who runs Alice's Registry, a small registrar: he's
considering doing this; he can help DNS holders register their keys if
people are interested, and will help get them into the DLV tree.

Q: Sam Weiler?, Sparta: echoes Randy's concerns about how ISC will
authenticate the entries. Registrars should consider running their own
DLV servers, as they have the relationship with the domain holder.

Code? Apparently you don't need code...

NANOG 37, ending slides:
425 attendees, 118 first-timers; lots of countries--most USA, 11 Canada,
scattered others. ISP, then NSP, then other categories. Top 3 companies
represented: Cisco, Juniper, Equinix.

HUGE thanks to Rodney Joffe and Neustar for pulling off a miracle to make
this happen at the last minute! Thanks to sponsors, beer, gear, other.
Many thanks to Susan R. Harris for all the work she has put in over the
years and to make this happen! Also huge thanks to all the other people
at Merit.

And we'll see you in St. Louis, Oct 8-10th, joint meeting with ARIN,
things set in stone.
Network will go down in 30 minutes or so--pack up and go home! :) I think that was the fastest closing I've seen at a NANOG yet. ^_^;;
2006.06.07 NANOG-NOTES TCP Anycast--don't spread the FUD!
(this was one of the coolest talks from the three days, actually, and has
gotten me *really* jazzed about some cool stuff we can do internally.
Huge props to Matt, Barrett, and Todd for putting this together!! --Matt)

2006.06.07 TCP anycast, Matt Levine, Barrett Lyon
with thanks to Todd Underwood

TCP anycast: don't believe the FUD
Todd Underwood is in Chicago; Barrett Lyon starts off.
[slides may eventually be at:
http://www.nanog.org/mtg-0606/pdf/tcp-anycast.pdf]

IPv4 anycast:
from a network perspective, nothing special--just another route with
multiple next-hops. Services exist on each next-hop, and respond from the
anycast IP address.

It's the packets, stupid:
the perceived problem is that TCP and anycast don't play together for
long-lived flows, e.g. high-def porn downloads. [do porn streams need to
last more than 2 minutes?] Some claim it exists, and works... yes, it's
been in production for years now.

Anycast at CacheFly:
deployed in 2002; prefix announced on 3 continents; 3 POPs in the US; 5
common carriers (transit) plus peering. Be sensible about who you peer
with. Effective BGP communities from upstreams are key--keep traffic
where you want it.

Proxy anycast:
proxy traffic is easy to anycast! Move HTTP traffic through proxy
servers. Customers are isolated on a VIP/virtual address, which happens
to exist in every datacenter. The virtual address lives over common
carriers, allowing even distribution of traffic; state is handled with
custom hardware that keeps state information synchronized across proxies.

Node geography:
anycast nodes that do not keep state must be geographically separated.
Coasts and countries work really well for keeping route instability
largely isolated. Nodes that are nearby could possibly require state
between them if local routes are unstable.

IP utilization:
"Anycast is wasteful"--people use /24s as their service blocks, using 1
/32 out of a whole /24. Really? How much IP space do you need to
advertise from 4 sites via unicast?
Carriers and peering:
for content players, having even peering and carriers is key. You may
cause EU eyeballs to go to CA if you're not careful with where you peer
with people. Having an EU-centric transit provider in the US without
having the same routes in the EU could cause EU traffic to home in the
US. Use quality global providers to keep traffic balanced.

When peering... keep in mind a peer may isolate traffic to a specific
anycast node. Try to peer with networks where it makes sense; don't
advertise your anycast to them where they don't have eyeballs! Try to
make sure your peers and transit providers know your communities and what
you're trying to do, and make sure you understand their communities well!

Benefits of anycast for content players:
moving traffic without major impact or DNS lag; provides buffers for
major failures; allows for simplistic traffic management, with a major
(potential) performance upside. It's BGP you don't control, though, so
there's not much you can do to adjust inbound wins. HTTP pays a
significant cost for using DNS to try to shift traffic around--six or
more DNS lookups to acquire content; anycast trims those DNS lookups down
significantly! Ability to interface tools to traffic management. No TTL
issues!

Data, May 9, 2006:
Renesys: monitored changes in the atomic-aggregator attribute for a
CacheFly anycast prefix--AS path changes and POP changes.
Keynote: monitored availability/performance of a 30k file.
Revision3: monitored behaviour of "long-lived" downloads of the
DiggNation videocast--over 7TB transferred.

Renesys data: 130 BGP updates for May 9th; a low-volume day, stable
prefixes; 34 distinct POP changes based on the atomic-aggregator property
on prefixes. 130 updates is considered a stable prefix.
SJC issue: in a thirty-five minute window, 0700 to 0735 UTC, saw 98
updates and 20 actual POP changes based on atomic-aggregator changes, all
from one San Jose provider--failing from SJC to CHI and back to SJC.
Unable to correlate these shifts with any traffic changes; most likely we
don't have a big enough sample size. Possibly just not a lot of people
using those routes.

BGP seems stable--what about TCP flows? The average time between SJC and
CHI and back again was about 20 seconds; very quick on the trigger to go
back to SJC, which would break all TCP sessions happening at the time.
For the most part, TCP seems stable.

Keynote: 30k download from 31 locations every 5 minutes, or an average of
1 poll per 9.6 seconds, compared against 'Keynote Business 40' data
collected on May 9, 2006. This represents short-lived TCP flows, though.
The orange line is the Keynote Business 40. Pegged 100% availability;
load time was lower than the Business 40 (0.2s vs 0.7s for the Business
40).

Revision3 data: monitored IPTV downloads for 24 hours (thanks, Jay!);
span port; analyzed packet captures looking for new TCP sessions not
beginning with SYN, compared against the global active connection
table--looked for sessions that appeared out of nowhere.

Long-lived data: 683,204 TCP sessions; anything less than 10 minutes was
thrown out; 23,795 sessions las
2006.06.07 NANOG-NOTES Anycast benefits for k root server
Break ends at 11:40, PGP signing will take place, and don't forget to fill out surveys. Anycast fun for the final sessions.
Lorenzo Colitti, RIPE NCC
[slides are at: http://www.nanog.org/mtg-0606/pdf/lorenzo-colitti.pdf
Agenda: introduction; latency, client-side and server-side; benefit of individual nodes; stability; routing issues
Why anycast? root server anycast is widely deployed: c, f, i, j, k, m at least.
Reasons for anycasting:
provide resiliency: e.g. contain DoS attacks
spread server and network load
increase performance
but is it effective?
Measure latency: ideally, for every given client, BGP should choose the node with the lowest RTT. Does it?
from every client, measure RTTs to the anycast IP address and to the service interfaces of global nodes (not anycasted)
for every client, compare K RTT to RTT of the closest global node: a = RTTk/min(RTTi)
if 1, BGP is picking the right node; if > 1, BGP picks the wrong node; if < 1, seeing a local node.
Latency with TTM: methodology
DNS queries from ~100 TTM test boxes; dig hostname.bind to see which host answers; extract RTT; take min of 5 queries
check paths to service interfaces; is it the same as the prod IP? according to RIS, mostly 'yes'
TTM probe locations, mostly in Europe
Latency with TTM: results (5 nodes)
most values are close to one; generally BGP is doing a pretty good job.
from 2 nodes to 5 nodes (2 nodes, April 2005; 5 nodes, April 2006): mostly the same results, clustered around one, whether 2 or 5 nodes.
consistency of 'a' over time: average of that over time. TT103 is an outlier; calculated over time, threw out that one outlier. Results are pretty consistent; average is a little higher than one, mostly consistent over time.
Measuring from servers: TTM latency measurements not optimal; locations biased towards Europe; limited number of probes (~100); don't reflect k client distribution. How to fix?
ping clients from servers: much larger dataset
methodology: process packet traces on k global nodes; extract list of client IP addresses; ping all addresses from all global nodes; plot distribution of 'a'
6 hours of data; 246,769,005 queries; 845,328 unique IP addresses
CDF of 'a' seen from servers: results not as good as seen by TTM; only 50% of clients have a = 1; about 10% are 4x slower/farther. Probably due to TTM clustering in Europe.
latency conclusions: 5 node result vs 2 node comparable, at least in TTM; non-TTM results not so rosy.
How many nodes are needed--is 5 enough? evaluate existing instances.
How to measure the benefit of an instance? Assume optimal instance selection, that is, every client sees the closest instance; this is an upper bound on the benefit; a consistent way to see if we've reached diminishing returns. For every client, see how much its performance would suffer if the chosen node didn't exist. B is the loss factor, how much a client would suffer if an instance were knocked out: B = RTT(without the instance)/RTT(with it)
Graph for LINX: 90% of clients wouldn't see an impact if it went away; 10% would see a worsening. Geographic distribution pretty wide.
AMS-IX: about 20% would suffer performance degradation; busiest two nodes, see a lot of clients, important to the k deployment. If they plot it for both LINX and AMS-IX together, about 65% wouldn't be affected; most of the others would see 4x, 10% would be 7x worse. So taken together, the *two* nodes are important.
Tokyo: best node for few clients; but those it serves are BADLY served by others; about 10% would go more than 7x worse if it went away, those clients mostly in Asia.
Miami node at NOTA: moderate benefit for some clients; US and South America would be badly served by Europe or Tokyo.
Delhi node is mostly ineffective; most would be served better by other nodes.
Condense the graph into one number to get a value for the effectiveness of each node: weighted average of B for each client. If the benefit value is 1, the node doesn't provide any benefit at all; larger numbers show higher benefits.
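The latency metric from the talk, a = RTTk/min(RTTi), reduces to a few lines; the client names and RTT values below are made up for illustration:

```python
# Sketch of the 'a' metric: for each client,
# a = RTT(anycast K) / min(RTT to each global node).
# a == 1: BGP picked the lowest-RTT node; a > 1: it picked a slower
# one; a < 1: the client is hitting a local node. Values invented.

def a_values(clients):
    """clients: {name: (rtt_to_k, [rtt_to_each_global_node])}"""
    return {name: rtt_k / min(rtts) for name, (rtt_k, rtts) in clients.items()}

clients = {
    "c1": (10.0, [10.0, 40.0, 90.0]),  # a = 1.0: BGP chose the closest node
    "c2": (80.0, [20.0, 80.0, 95.0]),  # a = 4.0: 4x slower than optimal
}
vals = a_values(clients)
frac_optimal = sum(1 for v in vals.values() if v <= 1) / len(vals)
print(vals["c2"], frac_optimal)  # -> 4.0 0.5
```

Plotting the CDF of these values over all clients gives the "CDF of 'a' seen from servers" graph described above.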
Europe, when taken together, high benefit, as is Tokyo; the Miami node not so effective, and Delhi is nearly ineffective.
Does anycast provide any value then? Knock out all except LINX; dark red curve (pre-1997): 10% wouldn't notice, 85% would get worse; benefit value is 18.8, so anycast does bring value.
Stability: the more nodes, the more routes competing in BGP; doesn't matter for single-packet UDP exchanges, does matter for TCP.
Look at node switches that occur: collect packet dumps on each node; extract all 53/UDP traffic, k nodes only, NTP synchronized; if an IP shows up on two nodes, log a switch.
5 nodes, April 2006: 0.06% saw switches; 2830 switchers out of 845,328, 0.33% switchers. No big issue with instance switches.
Routing issues: k-root structure, 5 global nodes (prepended): LINX, AMS-IX, Tokyo, Miami, Delhi; different prepending values; no-export causing reachability issues.
TT103 has a value of 200; the graph axis is cut. TT103 is in Yokohama; Tokyo is 2ms away; but the query goes to Delhi through Tokyo to LA. 416ms vs 2, so the value is 208.
Thanks to Matsuzaki and Randy Bush, got BGP paths from AS2497; bad interaction of different prepending lengths need
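The per-node "knockout" benefit described above can be sketched as a small calculation; the client RTT tables are invented for illustration, and real analysis would weight clients by query volume:

```python
# Sketch of the loss factor B: remove a node, recompute each client's
# best RTT, and take B = RTT(without node) / RTT(with node). The
# average of B over clients condenses a node's value into one number
# (1.0 = no benefit; larger = more benefit). RTTs below are invented.

def knockout_benefit(clients, node):
    """clients: {name: {node_name: rtt_ms}}; average B with `node` removed."""
    bs = []
    for rtts in clients.values():
        best = min(rtts.values())
        without = min(v for n, v in rtts.items() if n != node)
        bs.append(without / best)
    return sum(bs) / len(bs)

clients = {
    "tokyo-user":  {"tokyo": 2.0,   "linx": 250.0, "delhi": 100.0},  # hurt badly
    "london-user": {"tokyo": 250.0, "linx": 5.0,   "delhi": 120.0},  # unaffected
}
print(knockout_benefit(clients, "tokyo"))  # -> (50.0 + 1.0) / 2 = 25.5
```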
2006.06.07 NANOG-NOTES Lightning talk notes
(I think these were the toughest to take notes on, since they went by so fast; took the most cleaning up afterwards. But they were also the best talks of the 3 days. I wish we could have flipped, and taken more time on Tuesday for them so we really could have dug in and asked the questions we were itching to ask. ^_^; --Matt)
2006.06.07 Lightning talks
Marty Hannigan, Renesys:
[slides are at: http://www.nanog.org/mtg-0606/pdf/lightning-talks/1-hannigan.pdf
Critical infrastructure, root server location analysis: where to stick your servers. :)
He took some public info out there on root-servers.org, talked to some people, extrapolated from a larger set of data.
Operator demographics. In US: 3 corp (a, c, j), 2 edu (b and d), 1 mil (g), 2 research (e/h), 3 nonprofit (f, i, l); autonomica is responsible for l, but hosts "some" instances on a CDN; the CDN is a US-formed entity. In EU: 1 nonprofit (k). Asia/Japan: 1 nonprofit (m).
92% of the system operated in US, 8% non-US; margin of error +/- 5%.
Entity type: non-US 8%, US corp 39%, US mil 23%, US edu 15%, US nonprofit 15%
Where? In 54 countries, all religions, all methods of governance. Politically: 79% are in democratic governments, 21% in other forms of government.
Global diversification for security and performance: instances spread across continents; different networks, different procedures, different software, different hardware, different weaknesses. The weaknesses become a strength, since they are diverse; no one weakness knocks out all servers.
little less open to insider malfeasance
Global distribution: NA 38%, EU 35%, Asia 12%, Aus 8%, east EU 3%, LA 2%, Africa 2%, ANT 0%; getting reasonable coverage in the world.
Situating a root server: relationship 101, who you know: ICANN, operator, IX, and RIR relationships; regulators. How you spin it: national pride, performance and security, betterment of user experience.
Threats: no different from anyone else; direct attacks, proxy attacks, botnets, easy money, miscreants masking other activities. Not sure what the motivations to attack root servers are; can't extort money from nonprofits.
Let's attack a root server: target $-root location; EU hosting facility, multi-post cabinet config with cabling and power under floor, unlocked cabinet, single-factor facility entry.
Physical attack: open cabinet door, access to power.
Hijack attempt: advertise a route, return bad answers.
Network attack: spoof source, random host queries, packet floods.
Summary: the root system is less likely to be subject to insider attack or weakness, but can be attacked at layer 3; there is likely good research data coming across those interfaces; the trend towards a collapsed root system, where root and TLD share the same hardware or networks, should be more closely examined.
Slides will be up soon; talk to him in the hallway.
NEXT, Anton Kapela: Network RTTs
[slides are at: http://www.nanog.org/mtg-0606/pdf/lightning-talks/2-kapela.pdf
I'm pinging 10: high rate active probes; we're pinging stuff really quickly; adjusted host kern.hz to 1000; select() gets pretty accurate, +-1ms emission accuracy; stuff is responding.
Interesting: 0.001% of data relates to end-to-end queuing.
What has been sampled? some cisco 7513s, IOS 12.3 mainline; linux 2.4.20; freebsd 4.8; NT4 sp6; various end-to-end paths on the u-wisc network. Raw data isn't terribly interesting.
in adaptive link layer protocols, see rate shifting manifested in RTT: wireless, HPNA/HCNA, powerline ethernet; 10, 30, 60, 90 second peaks.
Fourier transforms, wavelet transforms, frequency domain: 1000 seconds at 10ms intervals, broken into composites. Aggregate graph at top, 0-50hz span on x axis; y axis is the contribution summary of the entire graph. Bottom right graph is roughly 200 samples of a range from 0-5hz; 100pps, deduce delay at half that sampling rate.
Delay is not a simple boring thing; it has scheduler delays, path dynamics not visible before. Got to see queue depths: shark fins showed up; congestion events do occur, and are quite measurable. When links are hot, queues are obvious, esp. on highly multiplexed links. Bottom left, cubic resonance: several tens of thousands of multiplexed flows hitting an odd resonance.
Pinging a windows machine, composite spectral fingerprint: 10, 20, 25, 30 spikes. Linux: fewer spikes. freebsd: low and flat. IOS is 10, 20, 30 and grass of 1hz spacing below 10hz. win32 delay spectrum also has 1hz fuzz below 10hz.
Sampled RTT and performed signal analysis on it; now what? Is network time continuous? Is round trip time discrete or continuous? No changes revealed as you go down lower. Is delay a "signal" anyway? What's with the 0 hz DC component in the FT output?
Could this be used for fingerprinting? Yes, could be like the next nmap. Packet-level fingerprinting is trivial to fake; but IP stack scheduler behaviour doesn't change so easily.
NEXT: Mikael Abrahamsson, Effect on traffic from the TPB bust, with Kurtis Lindqvist
[slides are at: http://www.nanog.org/mtg-0606/pdf/lightning-talks/3-abrahamsson.pdf
Bittorrent background: p2p protocol for filesharing. text
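The frequency-domain view Kapela described (sample RTTs at a fixed rate, transform, look for spikes) can be sketched with a plain DFT; the RTT trace below is synthetic, and real work would use numpy/FFT over far more samples:

```python
# Sketch: discrete Fourier transform of a sampled RTT series, looking
# for the strongest periodic component. Stdlib-only DFT via cmath;
# the 5 Hz queueing component in the trace is invented for illustration.
import cmath, math

def dft_magnitudes(samples):
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))) for k in range(n // 2)]

rate_hz = 100                      # 100 pps probe rate, as in the talk
t = [i / rate_hz for i in range(200)]
# synthetic RTT trace: 20 ms baseline plus a 5 Hz periodic component
rtts = [0.020 + 0.004 * math.sin(2 * math.pi * 5 * ti) for ti in t]

mags = dft_magnitudes(rtts)
peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])  # skip DC (k=0)
print(peak_bin * rate_hz / len(rtts))  # frequency of strongest component (Hz)
```

The k=0 term is the "0 hz DC component" the talk asks about: it is just the mean RTT, which dominates unless you skip or subtract it.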
2006.06.07 NANOG-NOTES Issues with IPv6 multihoming
(hope the inclusion of URLs in the notes isn't making them all end up in people's spam folders... --Matt) 2006.06.07 Vince Fuller, from Cisco and Jason Schiller from UUnet [slides are at: http://www.nanog.org/mtg-0606/pdf/vince-fuller.pdf IPv6 issues routing and multihoming scalability with respect to routing issues. how we got where we are today define "locator" "endpoint-id" and their functions Explain why these concepts matter, why this separation is a good thing understand that v4 and v6 mingle these functions, and why it matters recognized exponential growth - late 1980s CLNS as IP replacement dec 1990 IETF OSI, TP4 over CLNS--edict handed down from IETF revolt against that, IP won ROAD group and the "three trucks" 1991-1992 running out of "class-B" network numbers explosive growth of "default-free" routing table eventual exhaustion of 32-bit address space two efforts -- short-term vs long-term More at "the long and winding ROAD" http://rms46.vlsm.org/1/42.html Supernetting and CIDR 1992-1993 Two efforts to fix it; CIDR, short term effort, long term effort became IPv6. IETF ipng solicitation RFC 1550 Dec 1993 Direction and technical criteria for ipng choice RFC1719 and RFC 1726, Dec 1994 proliferation of proposals TUBA == IP over CLNS NIMROD==how to deal with it from high level Lots of flaming back and forth, not much good technical work. choice eventually made on political choices, not technical merit. Things lost in shuffle...er compromise included: variable length addresses decoupling of transport and network-layer addresses clear separation of endpoint-id/locator (more later) routing aggregation/abstraction "math is hard, let's go shopping" -- solving the real issues was set aside, people focused on writing packet headers instead identity -- what's in a name think of an "endpoint-id" as the "name" of a device or protocol stack instance that is communicating over a network in the real world, this is like your "name"--who you are. 
a "domain name" is a human-readable analogue
endpoint-IDs: persistent--long term binding, stays around as long as the machine is up; ease of administrative assignment; hierarchy along organization boundary (like DNS), not topology; portable: stay the same no matter where in the hierarchy you are; globally unique! unlike human names. ^_^
Locators: "where" you are in the network; think of "source" and "dest" addresses in routing and forwarding as locators; the real-world analogy is street addresses or phone numbers; typically some hierarchy (like an address), or like a historical phone number (before portability!)
Desirable properties of locators: hierarchical assignment according to topology (isomorphic); dynamic, transparent renumbering without disruption; unique when fully specified, but may be abstracted to reduce unwanted state; variable length (real world--you don't need the exact street address in Australia to fly there); possibly applied to traffic without end-system knowledge--effectively like NAT, but doesn't break end-to-end.
Why should I care? In v4/v6 there are only "addresses", which serve as both endpoint-ids and locators. This means they don't have the desirable properties of either: assignment to organizations is painful because use as a locator constrains it to be topological; exceptions to topology create additional global state; renumbering is hard: DHCP isn't enough, sessions get disrupted, source-based filtering breaks, etc. Doesn't scale for large numbers of "provider-indep" or multihomed sites.
Why should I care? Currently, v6 is only a few hundred prefixes; won't be a problem until it really catches on, at which point it's too late. Larger v6 space gives potentially more pain. NAT is effectively an id/locator split--what happens if NAT goes away in v6? The scale of IP networks is still very small compared to what it could grow to. Re-creating the routing swamp with ipv6 with longer addresses could be disastrous; not clear if the internet could be saved in that case.
Been ignored by IETF for 10+ years; the concepts have been known since the 60s.
Can v6 be fixed? And what is GSE, anyhow? Mike O'Dell proposed this in 1997 with 8+8/GSE: keep the v6 packet format, implement an id/locator split.
http://ietfreport.isoc.org/idref/draft-ietf-ipngwg-gseaddr
basic idea: separate the 16-byte address into an 8-byte EID and 8-byte routing goop/locator; change TCP/UDP to only care about the 8 EID bytes; allow the routing system to muck with the other 8 bytes in-flight. Achieves the goal of the EID/locator split while keeping most of IPv6, hopefully without requiring a new database for EID-to-locator mapping. Allows for scalable multi-homing; renumbering can be fast and painless/transparent to hosts.
GSE issues, problems with it: incompatible changes to TCP/UDP; in 1997, no IPv6 installed base, easy to change; now, v6 deployed, is it too late to change? Violation of the end-to-end principle. Perceived security weakness of trusting the "naked" EID (steve bellovin says this is a non-issue). Mapping of EID to EID+RG may add complexity to DNS depending o
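The 8+8 split can be illustrated with a few lines; this is an illustrative sketch of the idea, not O'Dell's actual spec (which defines specific field boundaries within the routing goop), and the addresses are documentation-prefix examples:

```python
# Illustrative sketch of the GSE "8+8" idea: treat a 16-byte IPv6
# address as 8 bytes of routing goop (locator) + 8 bytes of EID. The
# routing system may rewrite the locator half in flight; transport
# protocols would hash only the EID half, so renumbering (changing
# provider/locator) doesn't break sessions. Addresses are examples.
import ipaddress

def split_8plus8(addr):
    b = ipaddress.IPv6Address(addr).packed
    return b[:8], b[8:]            # (routing goop, endpoint-id)

def rewrite_locator(addr, new_goop):
    _, eid = split_8plus8(addr)
    return ipaddress.IPv6Address(new_goop + eid)

old = "2001:db8:aaaa:1::beef"
new = rewrite_locator(old, bytes.fromhex("20010db8bbbb0002"))
# EID half is unchanged even though the locator half (provider) moved:
print(split_8plus8(old)[1] == split_8plus8(str(new))[1])  # -> True
print(new)  # -> 2001:db8:bbbb:2::beef
```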
2006.06.07 NANOG-NOTES Smart Network Data Services
(I'm starting to guess I'd finish sending these out faster if I stopped falling asleep on my keyboard so often... --Matt)
2006.06.07 Welcome to Wednesday morning
http://www.nanog.org/ click on Evaluation Form; let us know how the M-W vs S-Tu format went; next time will be S-Tu due to the ARIN joint meeting, but we need more feedback!
Bill Woodcock, been on the program committee. And lightning talk people need to send their slides to Steve Feldman!!
Eliot Gillum, ISP community notifications: Smart Network Data Services
[slides are at http://www.nanog.org/mtg-0606/pdf/eliot-gillum.pdf
AGENDA: postmaster services; SNDS problem; goal; today; tomorrow; motivation; feedback/dialog; questions/discussion
Postmaster--the starting point for any issues you have sending mail into Hotmail/MSN Live. It's like AOL skunkfeed, you can do junk mail reporting. Lets you see what bad stuff is coming from your domain. SenderID. Site is at: http://postmaster.msn.com/snds/
Problem: bad stuff on the internet (spam, phishing, zombies, ID theft, DDoS) makes customers unhappy.
Solution #1 -- try to stop it before it hits customers; doesn't really *solve* the problem.
Solution #2 -- take what we learn, apply it upstream, get more bang for the buck. #1 alone is too slow.
ISP-centric efficiency: with solution #1, n ISPs have n-1 problems each, total is O(n^2); with #2, n ISPs have 1 problem (themselves), total is O(n); reduces the work of the overall system.
Crux: today, people and ISPs are measured by how much BAD stuff they *receive*, not judged by what they send out.
similar to the healthcare industry: no tight feedback loop to ISP behaviour. Nice quotes on slides: http://www.circleid.com/posts/how_to_stop_spam
7 step program (like 12 step, but shorter): 1: recognize the problem: SNDS; 2: believe that someone can help you: Me; 3: decide to do something: You; 8: make an inventory of those harmed: SNDS; 9: make amends to them: Tools; 10: continue to inventory: SNDS; 12: tell others about the program: You
What is SNDS? A website that offers free, instant access to MSN data on activity coming from your IP space; data that correlates with "internet evils"; informs the ISP to enable local policy decisions. Automated authorization mechanism uses WHOIS and rDNS; users are people, not companies. A force multiplier attempt. You can do it on your own, no need to sign up your company officially, as long as you're an rWHOIS/WHOIS contact.
SNDS goal: provide info which allows ISPs to detect and fix any undesired activity; qualitative and quantitative data; "No ISP left behind"; stop problems upstream of the destination; bring the total cost of remediation to an absolute minimum; keep the service free; make the internet a better place.
We have data! Windows Live Mail/MSN Hotmail is a spam and spoofing target: 4 billion inbound mails/day; 90/10 spam/ham by filtering technologies; user reports on spam, fraud, etc. Inbound mail system slide--ugly to read, too dark.
SNDS website slide shown. You can see daily aggregated traffic from your network: activity periods, IPs, commands and messages seen on port 25, samples of exchanges. Filter results on your mail; the rate at which users press "this is junk" on your mail; trap counts for when IPs hit their junk filters. The comments column is a catch-all for anything else they might put in, like open proxies, when tested positive. "Export to CSV" button, so you can feed the data into your own systems if you want.
Today's scenario: illustrate the magnitude and evidence of a problem.
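Feeding the "export to CSV" data into your own tooling might look like the hedged sketch below; the column names (`ip`, `messages`, `junk_reports`) and threshold are invented for illustration, so check the actual SNDS export for the real layout:

```python
# Hypothetical sketch: parse a CSV export of per-IP sending data and
# flag IPs whose junk-button ("this is junk") rate crosses a threshold.
# Column names and data are invented; not the real SNDS schema.
import csv, io

SAMPLE = """ip,messages,junk_reports
192.0.2.10,1000,5
192.0.2.11,800,120
"""

def noisy_ips(csv_text, junk_rate_threshold=0.05):
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rate = int(row["junk_reports"]) / int(row["messages"])
        if rate > junk_rate_threshold:
            flagged.append((row["ip"], rate))
    return flagged

print(noisy_ips(SAMPLE))  # -> [('192.0.2.11', 0.15)]
```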
additional resources: monitoring infrastructure
SNDS stats: 2500 users, mostly senders; 67 million IPs; 10-20% of inbound mail and complaints. Output drops by 57% on /24+ when monitored by SNDS.
SNDS tomorrow: usability: signup by ASN, better support for upstream providers, access transfer. Utility: programmatic access. Data: virus-infected emails, phishing, honeymonkey, sample messages. Expand the coverage, try to hit more of the problems on the net. Provide sample messages, compelling evidence when facing customers. This hasn't shipped yet, it's what he's hoping to have in a month or two.
Tomorrow's scenarios: lowered barrier to entry, recurring "cost". ISP types: end-user; tier 1/2 monitoring; tier 2/3 directly. Attack more than just spam: virus emails -> infected PCs, outbound virus filters; phishing/malware hosting -> takedowns. Is asymmetric routing a sign of people trying to launch hidden abuses of the net? Looking to hit more issues, like spotting virus-laden messages, either infected or an open relay. Hoping that automation speeds response.
Safety tools: Stinger: http://vil.nai.com/vil/stinger Nessus: http://www.nessus.org/ [oy, read the list from his slide, it's long.] Green items on the list are free, others are pay-for products. Pay-for isn't necessarily a bad thing if you get benefit! Safety tool breakdown from MSN on the next slide.
Motivation: hypothesis: everyone benefits. Customers: infected users get fixed; safer, cheaper, better internet experience. ISPs: solution #1 isn'
2006.06.06 Net Optics Learning Center Presents Passive Monitoring Access
(apologies, this really was just a marketing presentation in very, very thin disguise. I really want that hour of my life back. :( --Matt ) 2006.06.06 Net Optics Learning Center Presents The fundamentals of Passive Monitoring Access [slides are at: http://www.nanog.org/mtg-0606/pdf/joy-weber.pdf TAP technology--tools change, but some things stay somewhat constant--need a way to collect information. Port contention for monitoring--how many people are running into these issues? How many people use SPAN ports to get access to information? Agenda: Present an overview of Tap technology and how it makes network monitoring and security devices more effective and efficient. tap technology overview taps, port aggregators, and regen taps active response, bypass switches link aggregators and matrix switches taps with intelligence Add more intelligence, SNMP capability into remote tap systems. passive monitoring access--you should have full access to 100% of the packet data; even errors, etc. at layer 1 and layer 2. passive means without affecting traffic no latency no IP addresses no packets added, dropped, or manipulated No link failure traffic can be collected via: hubs optical taps What is zero delay? eliminates delays caused by the 10msec delay found in most taps when the tap loses power. Zero Delay means if the tap loses power no packets dropped/resent no latency introduced power loss to tap undetectable in the network Hubs are cheap and easy, get most of the info you need. The more utilization, the higher the collision rate means you're not getting all the data you need. Placing devices in-line; you get full visibility, but requires impact when you need to move monitoring tool from one place to another, or work on the tool. advantage: see all traffic including layer 1 and 2 errs preserve full duplex links SPAN ports--gain access to data, internal to a switch; good for data internal to switch fabric. 
But you lose layer 1 and layer 2 errs; not so bad for security tools, but for network debugging, horrible. Only supports seeing data flowing through a single switch; fights over who gets access to the port for tools.
Test Access Ports (TAP): designed to duplicate traffic for monitoring devices. You put it inline once; it's inline, passive. Preserves full duplex links; device neutral, can be installed between any 2 devices; remains passive, no failure point introduced. Fiber taps don't even require power; always fail through, no interruption. Creates a permanent access port to the data stream.
Copper and fiber are handled differently: copper has a retransmit system to replicate the information; fiber just splits the photon streams. Two output ports, only transmitting data; no way to send data back through, no way to introduce errors.
Different types: single tap: duplicates link traffic for a monitoring device; regeneration tap: duplicates link traffic for multiple monitoring devices; link aggregator tap: combines traffic from multiple links; matrix switches: offer software-controlled access to multiple links.
Other tap options: built-in media conversion--use mismatched interfaces without a separate media converter; active response--inject responses back into the link. Converter taps serve two purposes--connect dissimilar interfaces without a media converter--but usually don't fail through cleanly. Active response is generally in the security arena; sends back to both sides.
Copper tap devices: 10/100baseT, 10/100/1000baseT triple speed, 1000baseT normal gig tap. Need TWO monitoring NICs to see full duplex data, since you get TWO TX links coming at you. Try to get a triple speed TAP with dip switch speed/flow setting, rather than trying to autosense.
Fiber taps: gigabit SX/LX/ZX, 10gig SR/LR/ER (multimode and single mode); still has 2 TX outputs. Topology, and split ratio: split ratio is the amount of light going to each port.
split ratio--the amount of light you're willing to tolerate giving up on the network port. Basically, work up a loss/power budget for the link, figure out how much you can afford to lose before you lose link. Need to make sure that there will be no impact for either end!
Q: Do you need to take into account the distance between the monitoring device and the tap output device? A: Yes, try to keep within the reduced power budget available off the monitor port; usually about 10 meters should be fine.
Q: Can you re-use optical taps for OC12 ATM as well as gigE or 10gigE? A: They will be specific for multimode vs single mode; if you stay at 50/50, generally not a problem.
Converter taps are generally powered: the primary path is passive, but the monitoring port has to be active to support the media conversion.
Port aggregator taps: full duplex link being tapped, aggregating out a single link so you don't need 2 NICs to capture the TX data. Can also make a port a full duplex, 2-way active/passive port in newer models. What about multiple output ports? allow passive access for multiple monitoring devices to a single throug
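The loss-budget arithmetic for a tap's split ratio reduces to a dB calculation; the figures below are illustrative, not from any vendor datasheet:

```python
# Sketch of the fiber-tap loss budget: a split ratio is the fraction of
# light kept on a leg, and the dB penalty on that leg is -10*log10(f).
# Then check that received power still clears receiver sensitivity.
# All power figures are invented for illustration.
import math

def split_loss_db(fraction):
    """Insertion loss (dB) on a leg that keeps `fraction` of the light."""
    return -10 * math.log10(fraction)

def link_ok(tx_dbm, fiber_loss_db, split_fraction, rx_sensitivity_dbm):
    rx = tx_dbm - fiber_loss_db - split_loss_db(split_fraction)
    return rx >= rx_sensitivity_dbm

print(round(split_loss_db(0.5), 2))    # 50/50 tap costs ~3.01 dB per leg
# e.g. -3 dBm launch, 4 dB of fiber/connector loss, 70/30 tap (network
# leg keeps 70%), receiver good to -17 dBm:
print(link_ok(-3.0, 4.0, 0.7, -17.0))  # -> True
```

This is the "no impact for either end" check: run it once for the network leg and once for the monitor leg with its own fraction.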
2006.06.06 NANOG-NOTES MPLS TE tutorial
(still here, just been really busy at work today; will try to finish sending the notes out tonight. --Matt)
2006.06.06 MPLS TE tutorial
Pete Templin, Nextlink
[slides are at: http://www.nanog.org/mtg-0606/pdf/pete-templin.pdf http://www.nanog.org/mtg-0606/pdf/pete-templin-exercise.pdf
He works in a Cisco shop, no JunOS experience; operator perspective, no logos.
Traffic engineering before MPLS--the "fish" problem: two parallel paths, one entry router, one exit router; you end up with all traffic taking one path, not using the other path. IGP metric adjustments can lead to routing loops; hard to split traffic. No redundancy left over if both paths are filled, but can be good for using 2 out of 3 paths.
MPLS TE fundamentals: packets are forwarded based on the FIB or LFIB; FIB/LFIBs are built based on the RIB.
TE tunnels: a TE tunnel interface is a unidirectional logical link from one router to another. Once the tunnel is configured, a label is assigned for the tunnel that corresponds to the path through the MPLS network (LSP).
TE tunnel basics: once traffic is routed onto the tunnel, the traffic flows through the tunnel based on the path. Return traffic could be placed onto a tunnel going the opposite direction, or simply routed by IGP.
Key terms for TE: headend: the router on which the tunnel is configured; tail: the destination address of the tunnel; midpoint: router(s) along the path of the tunnel LSP.
Basic TE config:
Global: mpls traffic-eng tunnels
IGP: must be OSPF or IS-IS; mpls traffic-eng router-id Loopback0; mpls traffic-eng
physical interfaces: mpls ip; mpls traffic-eng tunnels -- tells the IGP to share TE info with other TE nodes
interface TunnelX
ip unnumbered loopback0 -- borrow the loopback's address so we can forward traffic down the tunnel
tunnel mode mpls traffic-eng
tunnel destination -- tunnel tail
tunnel mpls traffic-eng path-option 10 dynamic -- find a dynamic path through the network; best path with sufficient bandwidth; will discuss path selection in a bit
Where are we at?
Tunnels go from headend to tail end through midpoint routers over a deterministic path. We know what commands go on a router: the global, physical interface, and tunnel interface commands.
TE and bandwidth: physical interfaces can be told how much bandwidth can be reserved (used): ip rsvp bandwidth X X. TE tunnels can be configured with how much bandwidth they need: tun mpls traff bandw Y. Tunnels will reserve Y bw on outbound interfaces, and find a path across the network with X(unused) > Y bw. This prevents oversubscription, or at least helps control it. You can allow for burst room, but for now we'll stick with static, non-oversubscribed links.
TE BW: operators can adjust the tunnel bandwidth values over time to match changes in traffic. If tunnels are dynamically placed, the tunnels will dynamically find a path through the network with sufficient bandwidth, or will go down.
TE auto-bandwidth magic: tunnels can be configured to watch their actual traffic, as in "sh int | inc rate", every five minutes, and update their reservation to match, at periodic intervals. Dynamic reservations to match the live network. Bandwidth is 'reserved' using RSVP but not "saved" for TE. Often buys enough time to identify a surge, see where the traffic is coming from/going to. The number is only a number in the control plane; no actual impact on the data plane, no shaping, no control on real data flows.
tunnel mpls traffic-eng auto-bw frequency Y: each auto-bw tunnel does "sh int" to capture its rate every 300* seconds; each auto-bw tunnel updates "tunn mpls traff bandwidth X" every Y seconds. The config actually changes; this will impact your RANCID tracking. It uses the highest measured rate during the interval Y. May want to tweak your load-interval, since it's a decaying function over time; 5 minute is a fairly smooth value. May need to tweak your config check-in system to avoid getting flooded with bandwidth adjustments.
Covered: TE tunnel basics; router config basics; general concepts about TE and bandwidth. In this case, the shortest path that has X bw available for reservation (actually, bw X at or below priority Y, but that's later).
SPF calculations:
step 0: create a PATH list and a TENT list
step 1: put "self" on the PATH list
step 2: move the lowest-cost entry on the TENT list to the PATH list
step 3: put the PATH nodes' neighbors on the TENT list
step 4: if the TENT list is empty, stop
step 5: jump back to step 2
Example exercise -- calculate router A's best path to router D using the handout.
CSPF notes: no load sharing is performed within a tunnel; as soon as a path is found, it wins. CSPF tiebreakers: lowest IGP cost; largest minimum available bandwidth; lowest hop count; top node on the PATH list.
Creating paths -- can be created dynamically, or statically via explicit paths.
Dynamic: tunnel mpls traff path-option X dynamic
Explicit paths: paths can be created manually by explicitly creating a path: "ip explicit-path name " next-address X next-address Y; tunnel mpls traff path-option X explicit name blah
Paths can be created manually by explicitly configuring a p
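The constrained SPF described above (prune links without enough reservable bandwidth, then run shortest-path-first on what is left) can be sketched like this; the topology and bandwidth numbers are invented, and the secondary CSPF tiebreakers are omitted:

```python
# Sketch of CSPF: filter out links whose available bandwidth is below
# the tunnel's reservation, then Dijkstra (TENT as a heap, PATH as the
# settled set) over the remaining graph. Illustrative topology only.
import heapq

def cspf(links, src, dst, needed_bw):
    """links: [(a, b, igp_cost, avail_bw)]; returns (cost, path) or None."""
    adj = {}
    for a, b, cost, bw in links:
        if bw >= needed_bw:                  # constraint: enough bandwidth
            adj.setdefault(a, []).append((b, cost))
    tent = [(0, src, [src])]                 # TENT list as a heap
    done = set()                             # PATH list
    while tent:
        cost, node, path = heapq.heappop(tent)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            return cost, path
        for nbr, c in adj.get(node, []):
            heapq.heappush(tent, (cost + c, nbr, path + [nbr]))
    return None

links = [("A", "B", 10, 500), ("B", "D", 10, 500),   # short path, less bw
         ("A", "C", 15, 900), ("C", "D", 15, 900)]
print(cspf(links, "A", "D", 600))  # short path pruned -> (30, ['A', 'C', 'D'])
```

With a 600-unit reservation the cheaper A-B-D path is pruned for lack of bandwidth, which is exactly the "best path with sufficient bandwidth" behavior of the dynamic path-option.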
2006.06.06 NANOG-NOTES IDC power and cooling panel
(ok, one more set of notes and then off to sit in traffic for an hour on the way to work... --Matt)
2006.06.06 Power and Cooling panel
Dan Golding, Tier1 Research, moderator
Hot Time in the Big IDC: Cooling, Power, and the Data Center
3 IDC vendors, 4 hardware vendors: Michael Laudon, Force10; Jay Park, Equinix; Rob Snevely, Sun; Josh Snowhorn, Terremark; David Tsiang, Cisco; Brad Turner, Juniper; Brian Young, S&D
The power and cooling crisis: internet datacenters are getting full; most of the slack capacity has been used up; devices are using more and more power. Low power density: routers, full-sized servers. Medium power density: 1u servers, switches. High power density: blade servers.
Many data centers are full at 70-80% floor space utilized. North America IDC occupancy is around 50%; the most sought-after space is around 70% full. When power and cooling capacity is used up, floor space is vacant but can't be used.
There is a relationship between power and cooling: devices are not 100% efficient; I^2R losses mean that power becomes heat (conservation of energy); heat must be dissipated. The ability to dissipate heat with normal cooling technologies is hitting the wall; need new techniques.
Some quick rules of thumb: a rack or cabinet is a standard unit of space, from 30-40 sqft per rack; power is measured in watts; many facilities do around 80-100w/sqft; at 30 sqft per rack, that's about 3kw/rack.
How did we get here? What is the current situation? Where are we going? [dang, he's flying through his slides!!]
Hardware engineers: T-series hardware engineer for Juniper; CRS-1 hardware; E-series; datacenter design issues for Sun. There were other hardware vendors who were not interested in showing up; these people were brave for coming up here! Josh Snowhorn, IDC planner; Jay Park, electrical engineer for Equinix; Brian Young, S&D cage design specialist.
What do the IDC vendors feel the current situation is in terms of power/cooling? How did we get here?
Josh--designed datacenters at 100w/sqft, more than enough for the carriers; the server guys hit 100w/sqft in a quarter rack. You could cannibalize some power and cooling, but still ran out of cooling. Now spend hundreds of millions to make 200w/sqft datacenters, or higher.
Now, to the hardware vendors--why are their boxes using up so much electricity, putting out so much heat? What are the economics behind increasing density and heat load?
From the high-end router space--it's been simple: bandwidth demand has grown faster than power efficiency can keep up with. In the past, had the ability to keep up: do power spins about every 2 years, halving power; but now bandwidth is doubling every year, while it takes two years to drop power in half. We've been losing at this game for a while, and running out of room on the voltage scale; 90nm is down at 1v, can't go much lower, since the diode drop is at 0.7v; at 65nm, it's still at 1v. There's no big hammer anymore for power efficiency. Need to pull some tricks out; may need to do clock gating, may get some 20-30% efficiency gains, but not much more that can be pulled out of the bag now. Newton was right; you can do some tricks, but no magic.
Chip multithreading is one area they're trying to squeeze more performance out of; don't replicate ancillary ASICs for each core. Also can more easily share memory. And nobody has a 100% efficient power supply, so you lose some power there too. More and more getting squeezed into each rack.
Also a drive on cost: amortizing costs over space and capability; reducing cost per port is a big driver. And customers are pushing for more and more density, since the cost of real estate is getting so high, each square foot costs so much. In Ginza, $120/sq ft for space. If you go to places where real estate is cheap, it's easier/cheaper to just build really big rooms, and let power dissipate more naturally. IDC people agree, some cities are just crazy in real-estate costs.
But for those in suburban areas, the cost of real estate isn't so expensive. 3kw per blade server; put a few in a rack and you hit nearly 10kw in a rack. Soon, will need direct chilled water in the rack to cool them. But chilled water mixed with other colocation and lower density cabinets is very challenging to build. Need enclosed space to handle local chilled water coolers in localized racks. 20 years ago at IBM, nobody wanted chilled water in their hardware. Now, we're running out of options. Disagree--there are other ways of handling the challenge: how thermally efficient are the rooms in the first place, and are there other ways of handling heat issues? Cable cutouts in floor tiles allow air to escape in areas that don't need cooling. Josh notes the diversity between carriers at 40w/sq/ft vs hosting providers at 400w/sq/ft is making engineering decisions challenging. It's not really about power anymore, we can get power; it's about cooling now. Dealing with space in wrong terms--watts/sq ft, vs requirements of each
2006.06.06 NANOG-NOTES CC1 ENUM LLC update
(sorry these are coming out delayed, I had to deal with an internal routing challenge for much of yesterday afternoon. --Matt) 2006.06.06 CC1 ENUM LLC IPv6 DAY http://www.ipv6day.org/ 6bone is being shut down today, on the grounds that IPv6 is live and commercial, based on Jeordi's findings. Quotes slide, link to page you can register your apps on... Moderator for second session, Vish from Netflix, member of program committee. couple of topics to talk about; will start off with Karen Mulberry from Neustar talking about the US ENUM trial This is her first NANOG, very informative, interesting, entertaining. CC1 ENUM LLC --what is it? some background: North American Numbering Plan, 19 countries. formed Sept 2004 by industry CC1 shared by 19 countries? US and Canada and others. LLC obtained the CC1 ENUM trial delegation in Feb 2006 1 exists at RIPE, points to a server in Canada, waiting for the rest to happen. USG "guiding principle" and Canadian government and Caribbean--interoperate, protect privacy, foster innovation, promote competition. US Trial is for End User ENUM ONLY applied to FCC for numbering for trial, waiver hasn't been given yet; only regional numbers, no 800, toll free, or other non-geo numbers used during trial No testing in enum.arpa? of carrier enum. CC1 ENUM trial test service as interface within CC1, specifically in US CIRA will host the temporary Tier 1 registry Each CC1 country must opt into ENUM trial, gets their own Tier 1 registry CIRA just handles 800 area codes for CC1 for US Canada itself has a trial committee, they are preparing their own corp. to handle Canada. And Jamaica is going to do their own. US Trial, TPAC is committee of trial participants, will produce trial results. Each country will do their own Tier 1B registry Trial roles--a number identified; Tier1B is a subset of a Tier1 registry Tier2 provider. Local exchange provider has to provide... [wow, slide went fast] Trial in 3 phases.
registry infrastructure registry/registrar interface application testing phase 2 is under development; phase 3 has some proposals. Phase 1 is underway. TPAC (trial committee) -- 11 members signed MOU developed documents thus far TPAC US trial estimated timeline phase 1: registry infrastructure late june/july, lasts 2 months, starts after FCC grants waiver phase 2: registry/registrar interface expected to start aug, lasts 2 months depends on when phase 1 ends, depends on FCC waiver phase 3 applications later this fall CC1 timeline as of march 2006 [eyechart slide, good luck reading it.] By Q4 2006, an RFP will be issued for commercial tier1 and tier1B registries for CC1, goal to go live mid 2007. commercial operations 2 RFPs tier 1A (for all CC1) tier 1B for US expect to see the RFPs Q3/Q4 2006, beta late next year. Challenges facing enum defining the global standard for Carrier/Infrastructure /Operator/Provider ENUM Protecting end user security and privacy managing opt in requirements ensuring verification and authentication integrating domestic/global policy mandates. how do we integrate what happens in the US with the rest of the world. CC1 ENUM info resources CC1 ENUM LLC http://www.enumllc.com/ US ENUM Forum http://www.enum-forum.org/ Canadian ENUM Working Group http://www.enumorg.ca/ Q: What about bringing carrier/operator enum to IETF forum? A: working on it -- there was an announcement yesterday in regards to that. Moving on to next speaker now.
2006.06.06 NANOG-NOTES DDoS attack information collection
Information collection on DDoS attacks, Anna Claiborne, Prolexic Technologies. [slides are at: http://www.nanog.org/mtg-0606/pdf/anna-claiborne.pdf DDoS mitigation service. personal experience mitigating over 150 DDoS attacks. Popular topic, but nobody talks about how you can defend yourself or take legal action; the only thing you can do is collect information. 0.1% of DDoS attacks end in an arrest; that's out of the number reported to the US Secret Service, and out of the ones that fall into their jurisdiction. These are real losses: A major US corp lost over $2mil in a 20 hour outage An offshore gambling company lost an estimated $4m in 3 days Online payment processor lost $400,000 in 72 hours online retailer lost $20K/day over 3 weeks. These are directly reported losses; doesn't include lost PR, etc. Canadian retailer spent $50K on hardware mitigation, got kicked out of 3 datacenters due to the DDoS attacks, spent $20K on IT and security consultants, and another $6K on a different mitigation that also failed. Basic Information Collection Get packet captures--either from the machine being attacked, a span port, or an upstream device: tcpdump -n -s0 -C 5 -w <file> (-s0 captures the full length of the raw packet; -C 5 limits each pcap file to roughly 5MB) take 3 or 4 over 15 minutes to start, and then repeat every hour Determine the type of attack and duration (e.g. SYN flood lasting 6 hours) Obtain as complete a list as possible of source IP addresses Save bandwidth graphs, flow data, pps graphs, any and all visual material relating to the attack Save any contact with the attacker: email, chat conversation, phone calls, etc. Get loss figures from management--downtime, per hour losses, per day losses; section 18 of some law, have to substantiate losses over $5k before you can take legal action against someone. Recommendations have a plan!
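As a concrete illustration of the "obtain as complete a list as possible of source IP addresses" step: a small Python helper (my sketch, not part of the talk) that collects the unique source addresses from `tcpdump -n` text output, e.g. from replaying a saved capture with `tcpdump -nr attack.pcap`. It assumes IPv4 dotted-quad addresses in the standard `-n` output format.

```python
import re

# Matches the source address in "tcpdump -n" text output lines like:
# "12:34:56.789 IP 192.0.2.10.4242 > 198.51.100.5.53: UDP, length 64"
SRC_RE = re.compile(r" IP (\d{1,3}(?:\.\d{1,3}){3})\.\d+ > ")

def unique_sources(lines):
    """Collect the unique IPv4 source addresses seen in tcpdump text output."""
    sources = set()
    for line in lines:
        m = SRC_RE.search(line)
        if m:
            sources.add(m.group(1))
    return sorted(sources)
```

Feeding the sorted list into your abuse reports (alongside the raw pcaps) covers both the human-readable and verifiable halves of the evidence.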
DDoS is stressful Put all attack information in a central location Good monitoring doesn't have to be expensive: a simple fiber card in a 1u box can be a mirror port for a large volume of traffic Don't have to have expensive hardware like Arbor boxes. Limit to 100mb to prevent killing your capture box. Graphs and flow data can be retrieved from upstream Find the source Use the list of source addresses, find a reputable hosting company, you may even see a friend's IP Approach the network with the infected machine, give them as much information as possible; it can take time finding someone willing to help Obtaining information is dependent on who you are dealing with, be as helpful as possible. Get information from the infected machine: netstat, tcpdumps, who is logged in, web logs, access logs Get and save the source code of the responsible process; this can take hours to weeks--Prolexic has a huge contact list, and even for them it can be really difficult And SAVE all your information to a central location! and back it up! Examine the source code scripts are best, you know exactly what's going on compiled code, run strings on it best case, you can get a name or identification for who wrote it, passwords, domain names, port usage worst case you can obtain information that doesn't make sense...yet (it may fit into a bigger context later) Locate the controlling server Examine the TCP connection table or source code to find the controlling server verify your information, scan or connect to the suspect machine contact abuse where the server is hosted, explain the situation have as much information as possible to verify your conclusion and validate your identity Good luck, most abuse contacts are less than helpful Raises a good question: how to improve awareness and get legitimate requests answered. (may be able to get the FBI to provide warrants to seize machines that are being used to control attacks against you, but it takes time and documentation) Hunting the attacker (not for the faint of heart!)
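"Run strings on it" means pulling printable runs (IRC server names, passwords, domain names, ports) out of a compiled bot binary. A minimal Python stand-in for strings(1), as an illustrative sketch rather than anything from the talk:

```python
def ascii_strings(data: bytes, min_len: int = 4):
    """Yield printable-ASCII runs of at least min_len bytes, like strings(1)."""
    run = bytearray()
    for byte in data:
        if 0x20 <= byte <= 0x7e:          # printable ASCII range
            run.append(byte)
        else:
            if len(run) >= min_len:
                yield run.decode("ascii")
            run.clear()
    if len(run) >= min_len:               # flush a run ending at EOF
        yield run.decode("ascii")
```

On a real sample you would read the file in binary mode and scan the output for hostnames and credentials; anything that doesn't make sense yet still goes into the central evidence store.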
Review all information gathered so far on the attack contact the attacker, establish a rapport save all information and/or conversations (important note, if conversations aren't on a public server, they can't be used) Piecing the information together to form a high level view of the exploit, attack, and attacker A long process, most attackers are highly motivated and skilled, you usually have to wait for them to slip up! Resources: local FBI field office department of cybercrime department of homeland security CERT Cymru--great guys, if they have to help you NHTCU--EU, cyber crime divisions in local offices Local US secret service--division of electronic crimes DDoSDB.org -- under development at the moment. how to identify/recognize different types of attacks may be able to put their attack database open to the public up there. A success story The tracking of x3m1st/eXe responsible for hundreds of extortion based DDoS attacks tracked for months eventually led to his arrest. hid behind four levels of compromised servers. eXe and his group only talked on private IRC servers; made the mistake of connecting from his home domain, from a m
2006.06.06 NANOG-NOTES network-level spam behaviour
2006.06.06 Nick Feamster, Network-level spam behaviour [slides are at: http://www.nanog.org/mtg-0606/pdf/nick-feamster.pdf Spam unsolicited commercial email feb 2005, 90% of all email is spam common filtering techniques are content based DNS blacklist queries are a significant fraction of DNS traffic today. (DNSBLs) Using IP address based spam blacklists isn't so useful. How spammers evade blacklists will be discussed as well. Problems with content-based filters ...uh oh, some technical glitches... Content-based properties are malleable low cost to evasion altering content based on scripts is too easy customized emails are easy to generate content based filters need fuzzy hashes over content, etc. high cost to filter maintainers as content changes, filters need to be updated. constantly tweaking SpamAssassin rules is a pain. false positives are always an issue. Content-based filters are applied at the destination too little, too late -- wasted network bandwidth, storage, etc.; many users receive and store the same spam content. Network level spam filtering is robust (hypothesis) network-level properties are more fixed hosting or upstream ISP (AS number) botnet membership location in the network IP address block country? are there common ISPs that host the spammers, for example? Avoid receiving mail from machines that are part of botnets. Challenge--which properties are most useful for distinguishing spam traffic from legitimate email? very little if anything is known about these characteristics yet! Randy gave a lightning talk last NANOG about some of this. Some properties listed.
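For reference, the DNSBL queries mentioned above are just DNS lookups of the reversed client address under the list's zone. This sketch (mine, not the speaker's) builds the query name, using the zen.spamhaus.org zone and the standard 127.0.0.2 test address as examples:

```python
import ipaddress

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reverse the octets, append the list zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the address too
    return ".".join(reversed(octets)) + "." + zone

# A real check would then resolve the name with an ordinary A query;
# an answer inside 127.0.0.0/8 means the IP is listed.  (Network lookup
# omitted here so the sketch stays self-contained.)
```

This construction is also why DNSBL traffic shows up as such a large fraction of resolver query volume: every inbound SMTP connection can trigger one lookup per list consulted.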
Spamming techniques mostly botnets, of course other techniques too we're trying to quantify this coordination characteristics how we're doing this correlations with Bobax victims from the Georgia Tech botnet sinkhole other possibilities: heuristics distance of client IP from the MX record coordinated, low-bandwidth sending looked at pcaps coming in from a hijacked command and control station, from bots trying to talk to it; spamming bots, Bobax drone botnet, exclusively used to send spam. Collection two domains instrumented with MailAvenger (both on the same network) sinkhole domain 1 continuous spam collection since aug 2004 no real email addresses--sink everything 10 million + pieces of spam sinkhole domain #2 recently registered Nov 2005 "clean control" domain posted at a few places not much spam yet--perhaps being too conservative contact page with random email contact, look at who crawls, and then who spams the unique email addresses Monitoring BGP route advertisements from same network Also capturing traceroutes, DNSBL results, passive TCP host fingerprinting, simultaneous with spam arrival (results in this talk focus on BGP + spam only) Mail Avenger, not an MTA, it forks to sendmail or postfix; it sits in front of the MTA, does things like DNSBL lookups, adds headers, passive OS fingerprinting, as the spam is arriving. Also logged BGP routes from the same network that got the spam; see connectivity to the spamming machine at the time. Picture of collection up at MIT network. Mail Collection: MailAvenger X-Avenger header. best guess at operating system, p0f, DNSBL lookups, traceroutes back to the mail relay at the time the mail was sent (used for debugging BGP) distribution across IP space plot /24 prefix vs how much spam coming from it. steeper lines mean more spam from that part of the IP space; you can see where spam is coming from. bunch comes from APNIC, cable modem space, etc. few interesting things to note; still redoing legitimate mail characteristics.
from georgia tech mail machines, it's legit plus spam, need to split out better. between 90.* and 180.*, legitimate mail mainly. Is IP-based blacklisting enough? Probably not: more than half of spamming client IPs appear less than twice. Roughly 50% of the IPs showed up less than twice; but that's a single sinkhole domain, would help more across multiple domains. emphasizes need to collaborate across multiple domains to build blacklists; any one domain won't see repeated patterns of IPs. Distribution across ASes 40% of spam coming from the US BGP spectrum agility Log IP addresses of SMTP relays Join with BGP route advertisements seen at network where spam trap is co-located. A small club of persistent players appears to be using this technique 61.0.0.0/8 AS4678 66.0.0.0/8 AS21562 82.0.0.0/8 AS8717 somewhere between 1-10% of all spam (some clearly intentional, others might be flapping) about 10 minute announcement time of the /8 while spam is flooded out. Might be interesting to couple this with route hijacking alerting to filter out if this is really a hijacking vs a flapping legitimate route. A slightly different pattern; announce-spam-withdraw on a minute-by-minute basis. really really egregious! Why such big prefixes? flexibility: client IPs can be scattered throughout dark space within a large /8 same sender usually returns with dif
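The per-/24 distribution plot described above can be tallied along these lines (an illustrative sketch of mine, assuming the spam log yields dotted-quad source-IP strings; the real study plotted a CDF over the whole IPv4 space):

```python
from collections import Counter

def spam_by_slash24(source_ips):
    """Tally spam arrivals per /24 prefix, the bucketing behind the IP-space plot."""
    buckets = Counter()
    for ip in source_ips:
        prefix = ".".join(ip.split(".")[:3]) + ".0/24"  # keep the first three octets
        buckets[prefix] += 1
    return buckets
```

The same Counter, keyed on exact IPs instead of /24s, is how you would check the "more than half of client IPs appear less than twice" observation on your own sinkhole data.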
2006.06.06 NANOG-NOTES DNS reflector attacks
(I was going to try to get all the notes from today's panels out before going to bed, but I fell asleep on my keyboard finishing up these notes, so I think I'm going to wait and send the batch of Tuesday and Wednesday notes out after things wrap up on Wednesday. Sorry about the delay, but I need a bit more sleep I think. ^_^;; --Matt) 2006.06.06 Morning welcome, and introduction of Chris Morrow, panelist Please fill out the survey today if you're going to be leaving! Frank Scalzo, Verisign Recent DNS reflector attacks. Attacker breaks into an innocent authoritative DNS server, publishes a large text record; then does queries from a zombie army against that record, with sources spoofed with the victim's IP. 5 gig attack, 2.2G made it, 3gig didn't. E.TN.CO.ZA DNS attack, 64 byte query, 63:1 amplification, 4028 byte answer, 34,668 reflectors. Victim sees 5G of traffic: 144,142 bps per reflector, 13.5 packets per second, 4.5 DNS answers per second. reflectors won't see this as anomalous for the most part; the top talker only sent 8.5 answers per second. No visibility into the attacker at all, but best guess was 79Mb of source generated 5Gb of responses. Record was maliciously installed; 2 auth servers, 1 compromised; 65% response, 35% name error. Answer comes in 3 fragments, larger than normal MTU. Attack came in 3 phases: first port 666, then ports 53 and 666, then all 53. Port shifts are nearly instant, so a fast command and control system is in place for it. Filter out open recursive DNS servers? you can't put an ACL in for 500,000 DNS servers. What about limiting DNS packets to 512 bytes? will break things. What about blocking 53 outside of your network hierarchy, forcing people to use your resolvers? What about discarding fragments? Challenge is getting your upstream to implement it, unless you have hardware and pipes to handle the flood coming at you to start with. Some ISPs won't do it unless they see live attack traffic, and a 24 minute attack is too short lived for ISPs to see and react to.
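The amplification numbers quoted above check out roughly as follows (a back-of-the-envelope sketch of mine, not the speaker's exact methodology):

```python
# Figures from the E.TN.CO.ZA reflection attack described above.
query_bytes = 64
answer_bytes = 4028
reflectors = 34_668
attack_bps = 5_000_000_000          # ~5 Gb/s arriving at the victim

amplification = answer_bytes / query_bytes      # ~63:1
per_reflector_bps = attack_bps / reflectors     # ~144 kb/s -- invisible per reflector
source_bps = attack_bps / amplification         # ~79 Mb/s of spoofed queries

print(f"~{amplification:.0f}:1, {per_reflector_bps:,.0f} b/s per reflector, "
      f"{source_bps / 1e6:.0f} Mb/s of source traffic")
```

The point of the arithmetic is the asymmetry: each reflector contributes so little that nothing looks anomalous locally, while the attacker only needs a FastE-class source to saturate multi-gigabit links.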
data from Jan 11 - Feb 27 this year. Attack queries/second consistent with avg reflector qps. one reflector sent 1.9M DNS answers to 1593 victims, 605 different queries to generate answers. 180TB of attack traffic on Feb 1st. after Feb 15th, ramped down. Assume a 4KB response packet, see attacks between 3G and 7G; the scary part is that it only took 130Mb to generate the 7G attack, and the 3 gig attacks are all from less than a FastE-connected compromised web server. 500,000 reflectors with a 2G source could generate a 120Gb DoS attack. Top victim got over 130Tb of attack traffic, the top bunch are all over 100Tb 65,461 ports used. Top port is less than 10% of traffic though top 20 domains used, mostly innocent bystanders. Internet root . was the second highest domain used; certainly can't filter *that* out. Fundamental challenge: UDP lacks a 3 way handshake, easy to spoof DNS is an easy target, so many unsecured DNS servers Other UDP servers need to be evaluated as well DNS closing 500,000 open recursive DNS servers will be very, very painful. poor separation between authoritative and recursive DNS servers. BIND allow-query ACLs; recursive DNS servers should not accept queries from outside. What if it's an embedded system like a wireless gateway? We depend on large records for DNSSEC, etc. Beyond open recursive DNS servers root domain "." was used most authoritative name servers will answer with an upward referral doesn't include actual IPs, but it's still 438 bytes, and pretty much every DNS server responds to it. Source validation IETF BCP 38 How do you manage 70,000 ACLs on 500 routers? what about people who are multi-homed with static routes? what about legacy stuff that works but shouldn't? strict RPF breaks with traffic asymmetry; loose RPF doesn't help with this. ISPs see the problem as long, hard, expensive to overcome, and they're right. If we never start trying, we'll never fix it!
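The strict-vs-loose RPF distinction mentioned above, as a conceptual sketch: strict uRPF drops a packet unless the best route back to its source points out the interface it arrived on (which is what breaks under asymmetric routing), while loose uRPF only requires that *some* route to the source exist. This is a toy FIB lookup of my own for illustration, not how routers implement it:

```python
import ipaddress

# Toy FIB: prefix -> egress interface of the best route (hypothetical entries).
FIB = {
    ipaddress.ip_network("192.0.2.0/24"): "eth0",
    ipaddress.ip_network("198.51.100.0/24"): "eth1",
}

def best_route(src):
    """Longest-prefix match for the source address, or None if unrouted."""
    addr = ipaddress.ip_address(src)
    matches = [(net, iface) for net, iface in FIB.items() if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen) if matches else None

def strict_rpf_pass(src, ingress_iface):
    """Strict uRPF: best route to the source must point out the ingress interface."""
    route = best_route(src)
    return route is not None and route[1] == ingress_iface

def loose_rpf_pass(src):
    """Loose uRPF: any route to the source suffices (only catches unrouted space)."""
    return best_route(src) is not None
```

The sketch makes the talk's complaint concrete: loose mode passes any spoofed address that happens to be routed somewhere, so it does not stop reflection attacks that spoof legitimate victim prefixes.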
Close open recursive DNS servers DNS servers should include filtering SOHO router vendors should fix their DNS proxy code, don't listen on the outside interface BCP 38 otherwise we'll be jumping from protocol to protocol. Questions? Q: What does Verisign do to protect their DNS servers? A: Anycast, massive peering and transit capacity Q: Jared Mauch, NTT/America; he turned on unicast RPF on the NANOG upstream link. 372,000 packets that people here have sent failed the RPF check. BCP 38 is hard Paul Quinn asked what percentage of the traffic that is. Bora Akyol, Broadcom--any data on source ranges on the packets being seen? He could look at the 1 in 10,000 netflow sampling to see, but the individual link is a /30, looks like a normal customer link. The Merit router isn't RPF'ing either. Q: Ren Provo asks when they will peer; A: not yet, next few months, Miami Terremark, and other sites domestically and internationally in the next year and a half.
2006.06.05 NANOG-NOTES BGP tools BOF notes
(ok, last set of notes for tonight, and then it's off to bed for 90 minutes of sleep before heading back to the convention center. ^_^; --MNP) 2006.06.05 Welcome to the 4th BGP Tools BOF! [slides are at http://www.nanog.org/mtg-0606/pdf/lixia-zhang.pdf Nick Feamster, Georgia Tech Dan Massey, Colorado State University Mohit Lad and Lixia Zhang, UCLA The Goal sharing some tools developed from our research efforts. hopefully will be useful for the operations community. Also to collect input on new tools we would like to see so they can develop them. Routing Configuration Checker Nick Feamster O-BGP data organization tool Dan Massey [slides are at http://www.nanog.org/mtg-0606/pdf/dan-massey.pdf The Datapository by Nick Feamster [I'm sorry, that just sounds *far* too much like something you do *NOT* want your bedside nurse administering...--MNP] Visualizing BGP dynamics using Link-Rank by Mohit Lad Open discussions and demos Nick Feamster Network Troubleshooting: rcc and beyond rcc: router configuration checker proactive routing configuration analysis idea: analyze configs before deployment many faults can be detected with static analysis. rcc implementation. http://nms.csail.mit.edu/rcc/ preprocessor -> parser -> relational database (MySQL), constraints <-> verifier <-> faults verifier is a template checker and set of constraints your configs are checked against. He's looking for GUI developers. very bare-bones command line right now. Parsing configurations--shows some output. He shows examples of the Abilene configs, which are non-anonymized. show all routers peering with a given AS, can look at route maps in each direction, etc. After running rcc on it, you get a web output which shows relationships--oh, pictures don't matter, with some more grease could be a reasonable representation of your network. Q: Randy Bush asks if it could show which peering sessions are missing? A: Not yet, but it could be added, thank you!
Shows processing and errors; you get a page that summarizes the things RCC thinks are errors. Signalling partition? that's a missing iBGP session; he needs some better lingo in places. Also shows anomalous imports, could be intended for traffic engineering; that's "inconsistent policy" in ISP speak. Some of the names will get fixed to make Randy Bush happy. Yes, but surprises happen! link failures node failures traffic volumes shift network devices "wedged" ... two problems detection localization Need to marry static config analysis with dynamic information (route is configured but isn't in the dynamic table) he skips a closer look, just some jargon. Detection: analyze routing dynamics; drill down on interesting operational issues. idea: routers exhibit correlated behaviour blips across signals may be more operationally interesting than any spike in one signalling system. How do you spot things in the churn? Detection three types of events single-router bursts correlated bursts multi-router bursts <---common; and commonly missed using simple thresholds Localization: joint dynamic/static which routers are "border routers" for that burst topological properties of routers in the burst. proactive analysis -> deployment -> dynamic -> reactive detection -> diagnosis/correction -> static -> By going back to the configs, lets you see if it's something happening inside the network, or on the edge. Specific Focus: firewall configuration difficult to understand and audit configs subject to continual modifications roughly 1-2 touches per day federated policy, distributed dependencies each department has independent policies local changes may affect global behaviour (These are pulled from Georgia Tech; 130 firewall configs. Builds static connectivity matrix.) Reactive monitoring...use probes from subnets to verify reachability/connectivity. 
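The "signalling partition" check (a missing session in what should be a full iBGP mesh) can be illustrated with a toy version. rcc's real constraint checks also understand route reflection and more subtle faults; this sketch of mine only covers the plain full-mesh case:

```python
from itertools import combinations

def missing_ibgp_sessions(routers, sessions):
    """Report router pairs with no iBGP session configured.

    In a plain full mesh every pair needs a session; a gap is what rcc
    flags as a "signalling partition".  `sessions` is a set of
    frozenset({a, b}) pairs parsed out of the router configs.
    """
    return [pair for pair in combinations(sorted(routers), 2)
            if frozenset(pair) not in sessions]
```

This is the flavor of static check the talk describes: cheap to run against parsed configs before deployment, no live routing data needed.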
(immediate) open issues reachability and reliability of controller service-level probes diagnostic tools != service-level happiness policy conformance. Q: can it give suggested remediation, or provide config templates for new routers being added? A: Good idea! OK, over to next presenter. Helps with understanding BGP data. BGP data collection and organization (OBGP) Tool Colorado state university/university of Arizona/UCLA BGP data collection takes lots of BGP data, from RIPE RIS, etc. ISP BGP peer router -> update oreg -> rib+update -> feeds into gigabytes of data, different formats, potential errors enter in, and severe lack of metadata. Other tools can use it, LinkRank, BGP-Inspect, and a bunch of people cite it in reports and research. OBGP motivation Large Volume of Data data from many sources (RIPE, RV, private data) Long time scales and very recent (real-time?) data Slightly different formats RIPE/RV use different naming conventions different dump intervals different timezones for older data Lack of MetaData would like to only see desired peers and desired update types Possible errors in the data are updates missing
2006.06.05 NANOG-NOTES Peering BOF notes
(This time around I opted to go to the peering BOF and take some notes. It's the one downside to parallel tracks--wish I could be in two places at once. ^_^;; --MNP) 2006.06.05 Peering BOF Bill Norton introduces the Agenda; unfortunately, my laptop took so long to boot, I missed the Agenda slide. Doug Toy?, Transit Cost Survey, data collected at NANOG 36; he's just here to present the collected info, not really representing anyone. Recap: At NANOG 36, people indicated their cost per Mb and commit level. length of contract was usually 1-2 years. 42 data samples collected avg $25/Mb $95/$10 were the extremes. Avg commit level 1440 Mbps Other observations as expected, cost per Mb tends to decrease as the commit level increases. Tier1's are more expensive Cost tends to vary more with Tier 1 providers than with others. between 0-500Mb commit level, prices are all over; at higher commits, prices level out at the bottom. Question: Mbps, is that the cap, the usage, inbound plus outbound? A: That's the general 95th percentile, the higher of the two of inbound and outbound. Committed amount. Graphs tend to approach a hard bottom; as commit increases, price doesn't change all that much. Bottom is around $10/Mb, even though commit levels increase. Of samples collected, 2/3 were from Tier1 providers. 90% of contracts are 1-2 years in length, so didn't cause much variance. Tier 1 definition is based on the Wikipedia definition. Questions from audience? Q: Data looked pretty clean; were there samples pulled out to make it look cleaner? A: No, other than people who left fields blank on the survey. Q: was there a timestamp of when the contract started? A: much of it wasn't complete. Mostly within the 1-2 year range for length as well as start date, so nothing really ancient in there. Bill Norton; people had some concerns about violating NDA or contract details when filling out the survey. Where do we draw the line in doing these types of surveys?
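The "95th percentile, higher of inbound and outbound" measurement from the Q&A above can be sketched as follows. Billing conventions vary by provider; this shows one common one (rank the 5-minute samples, discard the top 5% per direction, bill the larger direction):

```python
def billable_mbps(in_samples, out_samples):
    """95th-percentile billing: drop the top 5% of 5-minute samples in each
    direction, take the next-highest value, and bill the larger direction."""
    def p95(samples):
        ranked = sorted(samples, reverse=True)
        return ranked[int(len(ranked) * 0.05)]   # one common indexing convention
    return max(p95(in_samples), p95(out_samples))
```

The commit level in the survey is then the floor: you pay for max(commit, billable_mbps) at the contracted $/Mb rate.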
SteveGibbard; NDA is an agreement between transit provider and customer, and this was anonymized and voluntary. Data is interesting, both for purchasers and sellers of transit. Q: 42 samples graphed, there were 80-100 people in the room at the time; so the real comments from the rest of the people weren't counted? A: No, there were fewer than 50 submissions total, of which 42 were complete. Q: Patrick; many people put more than one transit provider on their form; how were those other transit providers handled? A: no clue, he just got a spreadsheet with data. Back to Bill Norton Peering Lists Issue -- make available to customer prospects? 15 mins. Peering disclosure dilemma: customers often ask for a peering list, sometimes peerings are restricted under NDAs Metric for determining connectedness, capacity, resiliency. Is there a better metric for customers? IX capacity in/out Peering pipe size? ISPs are getting commonly asked about this, based on hands raised in the room. How many people lose business because the customer doesn't get an answer? Sylvie, VSNL notes they provide the info when they're under an RFP; they won't give capacity, they give an aggregate, they won't go peer by peer, that would be a violation of NDA. BillN: are the NDAs written to allow total numbers like that? Sylvie: you should not disclose capacity per location or per circuit, but they don't forbid aggregate total numbers. BillN: is there something else that could be given to the customer that would satisfy their question without revealing what's under NDA? Chuck: A lot of ISPs lie about their peerings; he runs AMS-IX, people claim to have multiple gigs to the peering exchange, he knows they don't really have that much. Patrick: but he can look at the peering stats on AMS-IX--Chuck notes only members can. Patrick: customers ask how many gigs they can send to a provider; it's available headroom, so they ask their upstreams how much available headroom is left.
Most providers are having a lot of trouble getting the right capacities to the right networks. The reason many don't answer is they don't like the answer they have to give. Ted Seely, Sprint: how do you solve the problem? There's lots of traffic that needs to be exchanged, how do you fix it? Patrick: how about everyone upgrade to 10gigE in many places? If you can't afford it, stop selling bandwidth. But most people can't go to all the different providers, they have to buy from a small subset of providers. RAS: No technology problem with doing it; it's the money. Not charging enough to cover the costs of the technology you need to install to cover the bandwidth. Ted notes you can't just link at one spot, you have to connect at six places, and you need to have links in and out of the site to support the volume, etc. Patrick: can you tell us how much they exchange? 40Gb times 6 providers in six locations is probably more traffic than Sprint has in total. Ted Seely; it's a time scale issue--yes, it can be solved, but in what timeframe? BillN feels it's reasonable information for
2006.06.05 NANOG-NOTES Network Neutrality panel notes
(since there's no slides for these online anywhere, and the slides were going past pretty quickly, I have to apologize for the gaps in the notes ahead of time. --MNP) 2006.06.05 Network Neutrality Panel [slides are not yet online] next up is the controversial subject of network neutrality; Bill Woodcock will be chairing the panel, so Randy Bush can go be a member of the audience again. Bill Woodcock: network neutrality has been in front of the press and legislatures for the past several months, and has been in the works behind the scenes for almost a year. Brokaw Price, peering at Yahoo! Sean Doran, free agent, "rooting" for Sprint back in the day Sean Donelan, now at Cisco, Gene Lew, at Neustar now, has done cable operations before. Sean can pretend to be Vint Cerf for this panel. :D Network Neutrality--what does it mean to operations people? History: Michael Powell, Feb 2004, defined four internet consumer rights (chairman, FCC) freedom to access content freedom to use applications freedom to connect personal devices freedom to obtain service plan information History of net neutrality concept: Feb 2005, Madison River telephone company consent decree "Madison River shall not block ports used for VoIP..." August 2005, FCC policy statement access lawful internet content run applications and services subject to the needs of law enforcement connect legal devices that do no harm to the networks competition amongst... All of these principles are subject to reasonable network management. That last bullet is what gives telcos the ability to quote QoS as a mandatory network management requirement. March 2006, internet non-discrimination act, senate bill 2360 only 2% of Americans have a choice in the last mile. shall not interfere with, block, degrade, alter, modify, impair, or change bits or content May take measures to protect customers from attacks may protect their own network infrastructure May 2006 Internet freedom and non-discrimination modifies the Clayton anti-trust act.
Passed based on party lines. Turns over to Sean to talk about his thoughts. Sean Donelan Doesn't represent anyone right now. The Huck Finn approach. And no, you can't configure this on your router. What are we talking about? Rep John Conyers (D-MI): "internet as we know it is at risk" Same guy noted in 2003 that cable operators were smart enough not to poison their customer pool Not really a new issue: ANS CO+RE in 1991 unapproved networks filtered from R&E gateways ANS and CIX, June 1992 ANS agreed to "provisionally" interconnect CIX proposed filtering resellers (194) NSF NAP/NREN solicitation (1993) required NSPs connect to all priority Network Access Points (NAPs) uncertainty created opportunities for new service providers pizzahut.com debate--couldn't reach it from university networks, but you could reach it from UUnet. Two-tiered internet even back then. Regulations chasing change Title I -- General FCC Authority (pre 1984) Title II common carrier voice/phone calls/later data Title III spectrum licensing broadcast TV and radio Title VI Cable Television (post 1984) FCC moved DSL (but not UNE-L and cable modems) back into Title I again. VoIP is still unknown. The Telecom act of 1996 was another biggie. before 1996, enhanced services transmitted over a common carrier. After 1996, info services were defined by the telecom act; they run over telecommunications provisioned by anyone, not just common carriers. That's why cable companies can offer telecommunications even though they're not common carriers. Also why radio and TV can put IP in a subcarrier without being common carriers. So, we have a really odd blend, they're not mutually exclusive. Now you have the potential for new entrants from any direction. Most home services now come over cable companies; multiple resellers of the same product wasn't enough to satisfy customers. Customers are very fickle Interfering with a customer's use of the Internet would hurt the provider's business.
No-one can predict what the next "Killer App" will be. And *everyone* can complain.
Both sides need each other to succeed.
Predicting in advance what customers want and will consider improvement vs. interference is hard to the point of being nearly impossible. And what customers consider an improvement one year may become interference, or what was once interference will later be considered improvement.
If you start writing regulations, you imply investigative and reporting requirements along with them. Who would enforce these regulations? And regulations seldom prevent people from being evil; the government simply sets the price for being evil.
Broadcast fairness doctrine--equal time; nobody can buy public advocacy on national networks, except for politicians, who get the best rates.
Kingsbury Commitment--AT&T had to interconnect with everyone--but that meant you didn't need a second long distance company, so it ended up supporting monopoly expansion.
Universal service backfired by giving a monopoly to those
2006.06.05 NANOG-NOTES IPv6 deployment at Comcast
Randy Bush, moderator of the next section.
He begged to do the introduction for a specific reason: a deployment of IPv6 that is beneficial to the company's P&L, possibly the only one in existence thus far. He took a very studied and purposeful view of using IPv6 to benefit his company!
IPv6 @ Comcast: Managing 100+ million IP Addresses
[slides are at: http://www.nanog.org/mtg-0606/pdf/alain-durand.pdf]
Alain Durand, Office of the CTO, Director IPv6 Architecture, [EMAIL PROTECTED]
Agenda:
Comcast needs for IPv6
Comcast plans for IPv6
Challenges
Simplistic view of the Comcast IP problem:
20 million subscribers in video
2.5 set-top boxes per subscriber
2 IPs per set-top box per the DOCSIS standard
total: 100 million IP addresses needed
and that's not including high speed data, nor Comcast Digital Voice, nor mergers/acquisitions.
Used to use RFC 1918 space for cable modems; that space was exhausted in 2005. Comcast recently was allocated the largest part of net 73 and has renumbered cable modems into that space.
In the control plane, all devices need to be remotely managed, so NAT isn't going to help us. IPv6 is the clear solution for us. However, even though we are starting now, the move to IPv6 isn't going to happen overnight.
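A quick sanity check of the numbers above (my own back-of-the-envelope sketch; the variable names are mine, only the per-device counts come from the talk):

```python
# Reproduce the address math from the slides.
subscribers = 20_000_000        # video subscribers
stb_per_sub = 2.5               # set-top boxes per subscriber
ips_per_stb = 2                 # per the DOCSIS standard

stb_addresses = int(subscribers * stb_per_sub * ips_per_stb)
print(stb_addresses)            # 100000000 -- 100M addresses for STBs alone

# All of RFC 1918 (10/8 + 172.16/12 + 192.168/16) is roughly 17.9M addresses,
# so even the whole private space falls short by more than 5x:
rfc1918 = 2**24 + 2**20 + 2**16
print(stb_addresses // rfc1918)  # 5
```

Which is exactly why RFC 1918 space ran out in 2005 and why NAT can't paper over the control-plane problem.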
Triple play effect on the use of IP addresses:

                          2005 (HSD only)   2006 (Triple play)
  Cable modem                   1                  1
  Home computer/router          1                  1
  eMTA (voice adapter)          0                  1-2
  Set-top box (STB)             0                  2
  Total IP addresses            1-2                8-9
  (assumes 2.5 STBs per household)

IP Addresses: Natural Growth vs New Services -- nice graph, based on trends, not real data.
Contingency plans:
use public address space
use "dark" space (pre-RFC1918 space)
federalization (split into separate domains) (trying to avoid that)
IPv6 strategy:
start early; deployment plans started back in 2005
deploy v6 initially on the control plane, for the management and operation of the edge devices they manage: DOCSIS CMs, set-top boxes, PacketCable MTAs (voice)
be ready to offer customers new services that use IPv6 LATER, not now--the first step is just to be able to manage their own gear.
migration to v6 must be minimally disruptive
deploying v6 must be in the roadmap for all vendors
ops, infrastructure, and systems must be ready to support v6 devices
over time, IPv6 will penetrate the Comcast "DNA"
Deploy v6 for the IP addresses of the CM and STB:
architecture: dual-stack at the core, v6-only at the edges
deployment approach: from the core to the edges: backbone -> regional networks -> CMTS -> devices
this is an incremental deployment; existing deployments will be untouched in the beginning
Follow the same operational model as with IPv4; lots of DHCP!
News flash: all routers on the Comcast IP backbone are IPv6 enabled; first ping on the 10GE production backbone. TTLs aren't quite working properly, still checking on that. [so, even mainstream vendors still don't have v6 working quite properly yet]
New CMs will be v6 ready (dual-stack capable):
on an IPv4-only CMTS, the CM will have a v4 address only
on a v6-enabled CMTS, the CM will only have a v6 address
No CM will have both; if they could support v4 on all of them, they wouldn't have this issue to start with!
Provisioning, Monitoring, Back-Office:
mostly a software upgrade problem, not unlike the Y2K issue: fields need to be bigger in databases and web scripts.
Should system "X" be upgraded for v6?
does it communicate with devices that are v6-only?
payload Q: does system "X" manipulate IP data that could be v6 (store, input, display)?
Comcast inventory analysis:
about 100 systems
10 need major upgrades for transport
30 need minor upgrades just for display/storage
Back-office management of cable modems:
network transport will still be v4; however, back-office systems may need to be modified to display/input/store v6-related data (CM v6 addresses). Payload can be v6 while transport is v4.
IPv6 certification:
Basic IPv4 compliance is taken for granted today; IP-level component testing is limited.
IPv6 is still a new technology; the maturity level of vendor implementations varies greatly. Some have had v6 for 10 years, and even those have features that aren't fully baked; others have nothing, and will rush to buy a 3rd-party stack.
The bar for v6 product acceptance has to be higher than what we typically accept now for IPv4:
formal v6 requirement list before purchasing
v6 conformance testing/certification to accept a product
v6 training:
most engineers have heard about it, but don't know much; fear factor
can expect new hires to have 2-4 years of v4 experience, but 0 v6
initial and continuous training process is critical!
v6 vendors:
CM (cable modems) (DOCSIS 3.0/2.0b)
CMTS
Router
Provisioning system
OSS
Video/voice back-end systems
Retail market (consumer electronics)
Home gateways
Video (e.g. TV with embedded cable modem)
Right now, the provisioning system is the most challenging.
v6 protocols, MIBs:
some OSS vendors implement RFC 2465 (deprecated)
some router vendors implement partial RFC 4293 (new combined v4+v6 MIB, but onl
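The "payload question" above (can system "X" store and display a v6 address even if its transport stays v4?) is easy to probe with Python's stdlib `ipaddress` module. This is my own illustrative sketch, not Comcast tooling; the function name is made up:

```python
import ipaddress

def field_needs_upgrade(stored_value: str) -> bool:
    """Return True if this stored address would not fit an IPv4-only field."""
    addr = ipaddress.ip_address(stored_value)
    return addr.version == 6

print(field_needs_upgrade("10.1.2.3"))      # False
print(field_needs_upgrade("2001:db8::1"))   # True

# The Y2K-style field-width issue: a v6 address needs up to 39 characters
# in text form, vs at most 15 for dotted-quad v4.
longest = ipaddress.ip_address("ffff:" * 7 + "ffff")
print(len(str(longest)))                    # 39
```

A back-office audit along these lines is what separates the "minor upgrade for display/storage" systems from the 10 that need transport work.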
2006.06.05 NANOG-NOTES interdomain routing via Wiser, Ratul Mahajan
2006.06.05 A simple coordination mechanism for interdomain routing
[slides are at: http://www.nanog.org/mtg-0606/pdf/ratul-mahajan.pdf]
Ratul Mahajan (now @ Microsoft Research), David Wetherall, Tom Anderson, University of Washington
The nature of internet routing: within a contractual framework, ISPs select routes that are best for themselves.
Potential downsides:
higher BW provisioning
requires manual tweaking to fix egregious problems
inefficient end-to-end paths
An alternative approach: coordinated routing. ISPs have joint control; path selection is based on the preferences of both ISPs.
Potential benefits:
lower BW provisioning
no egregious cases that need manual tweaking
efficient end-to-end paths
basis for interdomain QoS
Existing mechanisms cannot implement coordinated routing:
route optimization boxes help (stub) ISPs pick better routes from those available
MEDs implement the receiver's preferences
Neither can create better paths that don't already show up in the routing table.
What makes for a good coordination mechanism? MEDs have some nice properties:
ISPs can express their own metrics
ISPs are not required to disclose sensitive info
lightweight; requires only pairwise contracts
It should provide joint control and benefit all ISPs.
Our solution: Wiser. It operates in a lowest-cost routing framework: downstream ISPs advertise their cost; upstream ISPs select paths based on both their own and the received costs.
Problems with vanilla lowest-cost routing:
ISP costs are incomparable
it can be easily gamed
When you bring incomparable costs together, the ISPs that use higher costs win out.
Cost normalization: normalize costs such that both ISPs have "equal say"; normalize such that the sum of costs is the same. Makes the system harder to game.
Bounds on cost usage:
downstreams log the cost usage of the upstream ISPs
compute the ratio of avg. cost of paths used and announced
contractually stipulate a bound on the ratio; similar to existing ratio requirements.
Wiser in action: announce costs in routing messages.
Normalization occurs between ISP pairs.
Example results, using major ISP topologies for experiments:
Wiser provides better control under link failure.
Wiser produces shorter path lengths.
Implementation: XORP prototype. Simple, backward-compatible extensions to BGP:
embed costs in non-transitive BGP communities
border routers jointly compute normalization factors and log cost usage
a slightly modified BGP decision process
Benefits even the first two ISPs that deploy it.
Summary: Wiser is a simple mechanism to coordinate interdomain routing; it may lower provisioning, reduce manual tweaking, produce more efficient paths, and help with interdomain QoS.
Feedback: [EMAIL PROTECTED]
http://www.cs.washington.edu/research/networking/negotiation/
Q: Danny McPherson: how do you normalize across multiple ISPs?
A: Routing advertisements happen on the sum of the costs announced from me to you, and from you to me. He derived it from different values in his experimentation; utilization and latency in general.
Q: Randy Bush: Whatever the metrics are, you just normalize by summing them up. But Danny notes that if you have multiple ISPs, how do you normalize them together?
Q: Danny: where was the 20ms of control plane savings seen that he claimed in slide 11?
A: That was based on default ISP policy, prefer customers over peers, etc. So the delay was control plane plus data plane; it wasn't control plane alone. He based it on the old Rocketfuel equation.
Randy Bush: vendors--this is cool stuff, open your ears.
Break time now.
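The "equal say" normalization described above (scale the neighbour's advertised costs so both ISPs' costs sum to the same total, then pick the lowest combined cost) can be sketched in a few lines. This is my own rendering of the idea from the talk, not the authors' code; all names and numbers are made up:

```python
def normalization_factor(my_costs, their_costs):
    """Scale the neighbour's costs so that, after scaling, they sum to the
    same total as ours -- neither ISP can win just by inflating its metric."""
    return sum(my_costs) / sum(their_costs)

def pick_path(paths):
    """paths: list of (name, own_cost, normalized_received_cost).
    Wiser-style selection: lowest combined cost wins."""
    return min(paths, key=lambda p: p[1] + p[2])[0]

# Two ISPs using incomparable internal metrics (say, latency vs utilization):
mine = [10, 20, 30]         # my internal costs for three candidate egresses
theirs = [900, 500, 1500]   # neighbour's advertised costs for the same egresses
f = normalization_factor(mine, theirs)
paths = [(n, m, t * f) for n, m, t in zip("ABC", mine, theirs)]
print(pick_path(paths))     # 'A' -- lowest combined cost after normalization
```

Without the scaling step, the ISP quoting costs in the hundreds would dominate every decision, which is exactly the gaming problem the talk calls out.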
2006.06.05 NANOG-NOTES Pretty Good BGP Josh Karlin
2006.06.05 Pretty Good BGP
Josh Karlin, Stephanie Forrest, Jennifer Rexford
slides are at: http://www.nanog.org/mtg-0606/pdf/josh-karlin.pdf
Main idea: delay suspicious routes; lower the preference of suspicious routes for 24 hours.
Benefits:
the network has a chance to stop an attack before it spreads
accidental short-term routes do no harm
no loss of reachability
adaptive
simple
Algorithm:
Detection:
monitor BGP update messages
treat origin ASes seen for a prefix in the past few days as normal
a new origin AS is treated with suspicion for 24 hours
treat new sub-prefixes as suspicious for 24 hours
Response:
suspicious prefixes are given low localpref; not used or propagated
suspicious sub-prefixes are temporarily ignored
Example: prefix hijack (without PGBGP), same specificity.
Example: sub-prefix hijack (without PGBGP), two /9s cut from a /8.
In these examples, AS 5 acted in its own self-interest, but it helped protect the rest of the net beyond it.
Simulations of two deployment strategies: random, and core+random.
Random: with 0 deployed, half the network will be affected; a better outcome as a higher fraction of ASes deploy it.
If the core of the network deploys (core ASes have at least 15 peer-to-peer links), that's only 62 out of the 20,000 ASes, and all but 2% of the network is protected.
Sub-prefix hijack suppression is a bit tougher, but still good results as the core implements it.
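The detection step above is simple enough to sketch directly: remember which (prefix, origin AS) pairs have been seen, and treat any brand-new origin as suspicious for 24 hours. A toy version (mine, not the authors' implementation; it models only detection, not the localpref response):

```python
import time

SUSPICION_WINDOW = 24 * 3600   # distrust a new origin for 24 hours

class PGBGP:
    """Toy PGBGP detector: a (prefix, origin AS) pair is 'suspicious'
    (i.e. would get a low localpref) until 24h after it is first seen."""
    def __init__(self):
        self.first_seen = {}   # (prefix, origin_as) -> timestamp first observed

    def is_suspicious(self, prefix, origin_as, now=None):
        now = time.time() if now is None else now
        key = (prefix, origin_as)
        if key not in self.first_seen:
            self.first_seen[key] = now
        return now - self.first_seen[key] < SUSPICION_WINDOW

pg = PGBGP()
print(pg.is_suspicious("192.0.2.0/24", 64500, now=0))             # True: never seen
print(pg.is_suspicious("192.0.2.0/24", 64500, now=25 * 3600))     # False: aged in
print(pg.is_suspicious("192.0.2.0/24", 64511, now=25 * 3600))     # True: new origin
```

Because a suspicious route is only de-preferred, not dropped, reachability survives even when the "suspicious" announcement turns out to be a legitimate move, which is the property the talk leans on.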
Hijacks in the wild:
1997: AS 7007 sub-prefix hijacked most of the internet for over 2 hours
Dec 2005: 26-95 hijackings during the month
Jan 2006: Panix's /16 stolen by ConEd
Feb 26, 2006: Sprint and Verio carried TTNET as origin AS for 4/8, 8/8, and 12/8
IAR: Internet Alert Registry
IAR verifies hijack attempts; a near-realtime database of suspicious routes.
Email alerts are sent to those who opt in, for the ASes they choose to receive alerts for; operators receive alerts only when their AS has caused the hijack or is the victim.
Tier-1 ASes receive one hijack alert per day, typically.
Working prototype.
Solutions with guarantees (and lots of overhead): sBGP, soBGP, psBGP.
Anomaly detectors: Whisper, MOAS lists, geographic-based.
Good practice: proper route filters. Route filters protect the internet from you and your customers, not vice versa.
Why Pretty Good BGP?
maintains autonomy
incrementally deployable: no flag day, no change to the BGP protocol
effective with a small deployment
only requires a software upgrade or a change in config generation
most important, requires minimal operator intervention
http://cs.unm.edu/~karlinjf/pgbgp/
Q: (someone?) from UCLA--if you delay the route for 24 hours, and the original AS withdraws it, what happens?
A: You'll still end up using the new route, as it just has a lower localpref, so moves will still work.
Q: Danny McPherson--what if the origin AS is spoofed by the hijacker to match the real origin AS--does this stop it?
A: No, that's a man-in-the-middle, or at least it looks like one, and this can't handle that, so it's only "pretty good"; that would be a later phase.
Q: He also notes that if your prefix is hijacked, your email alert is likely to get jacked as well.
A: True--subscribe from multiple prefixes/domains to be safe!
Q: Phil Rosenthal, ISPrime: What happens when a small ISP in South America leaks the internet to an upstream that doesn't filter them?
A: Yes, those leaks suck up a lot of memory; this doesn't help, because the origin AS is still correct but the intervening paths are bogus. If a route for a sub-prefix is seen with the origin AS along the path, it's not seen as a hijack.
Q: Jared Mauch, NTT America; follow-on point: you just have a strange AS along the path, but the origin is correct.
A: No, they don't look at the whole path yet; maybe in the future.
Q: Sandy Murphy, Sparta--thinking of the statement at the end, it handles backup routes OK; it works best where operational changes of the origin happen at a human-paced interval. There are some prefixes which seem to oscillate at a much more rapid pace. What about studying prefix behaviour over a longer period of time? Is it locked into 24 hours, or can it be adjusted to match the frequency better?
A: Not locked at 24 hours; it could be adjusted to a different 'sensitivity' as needed.
Q: Randy Bush, IIJ: The internet is not static; those things which rely on viewing it as static, like route flap damping, can bite us. We need to enable more and more dynamic behaviour, not less, and Randy thinks this is going in the wrong direction.
A: That's nice, but the presenter disagrees and thinks this is a helpful step in the right direction.
2006.06.05 NANOG-NOTES AS-PATH prepending measurements
2006.06.05 Active measurement of the AS-path prepending method
[slides are at http://www.nanog.org/mtg-0606/pdf/samantha-lo.pdf]
This is the research forum part of the meeting: people doing real research on real networks.
Samantha Lo and Rocky K.C. Chang, Department of Computing, {cssmlo,[EMAIL PROTECTED], Kowloon, Hong Kong. Dr. Rocky Chang is her supervisor.
Motivations: operators apply AS-path prepending on a trial-and-error basis to control inbound traffic.
How effective can the AS-path prepending method be? What happens to the routes after prepending on a link?
The measurement setup: a dual-homed stub AS connected to AS 9304 and AS 4528, with two upstream links, L1 and L2. Announce a beacon prefix to both links, with prepending on L1.
Graph with prepending length on the X axis, from 0 to 5, then back down; wait 6 hours between each change to stabilize.
The ratio goes from 102:29 at prepending length 0 on L1, to 14:91 at 5 on L1. The greatest change is between prepending lengths 2 and 3. When decreasing, they see an unbalanced phenomenon.
Who was responsive to prepending?
When the incoming link to the beacon prefix changes, the next-hop of routes also changes in the remote AS.
Passive-responsive ASes are those where the next-hop for the route didn't change, but the subsequent path is different.
Active-responsive: the next-hop actually changes.
Non-responsive ASes see no change: 43 ASes had no change in either incoming link or next-hop; on L1, 14 ASes use one next-hop only.
Passive-responsive ASes: 26 ASes; incoming link changes, no change in next-hop.
Active-responsive ASes: 47 ASes; both incoming link and next-hop change. Possible reasons: they apply a shortest-path policy, with no localpref override.
Active-responsive ASes: UUNET, Teleglobe, a bunch of others; the slide went past pretty quickly.
Most of them are located 4 AS-hops away from L1; after prepending, they are 5 AS-hops away via L2. Routes go to L1 at 4, via L2 at 6, when starting.
What if both AS paths, via L1 and L2, have the same length?
"Equal to or greater" policy: AS 1239 is located 4 AS-hops via L1, and 5 AS-hops via L2.
AS 3662 has prepended once, so the path via L2 is effectively 6.
So prepending once on L1: 5 < 6, no change. Prepending twice on L1: 6 = 6, and the route changes to L2 even though they're equal.
AS 3257, located the same as 1239: when increasing prepending to 2, L1 is 6 (4+2), L2 is 6 (5+1), but it still uses L1. When increased to 3, 7 > 6, and it finally changes to L2. When decreasing to 2 again, it's equal again, but it doesn't flip back to L1 until the prepending is down to 1, at which point 5 < 6 and it finally shifts to L1. This is the "greater than" policy: with the same prepending length, it uses different routes; it 'sticks' to the previously used path.
BGP update graph: after prepending, update messages continue for several hours.
Conclusions and future work:
Route changes are introduced by active-responsive ASes; shortest-path policies plus topology determine when they will change.
Possible applications: predict the amount of traffic shift; discover the upstream ASes' policies.
Thanks to Michael Lo and Lorenzo Colitti.
Q: Randy Bush, IIJ, notes that from her slides, Tim Griffin describes her "delayed reaction" as 'BGP wedgies'. It comes from the BGP tie-breakers; it's not something you'll be able to predict.
A: She notes it's a policy choice of tie-breakers.
Q: Randy insists it's not a matter of policy choice; it's built into BGP, and not something they have control over.
Moving on to the next speaker.
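The two tie-breaking behaviours observed above ("equal to or greater" vs the sticky "greater than" policy) can be reproduced with a tiny simulation. This is my own rendering of the notes, not the authors' code; the path lengths match the AS 3257 example:

```python
def next_path(current, len_l1, len_l2, policy):
    """One remote AS re-evaluating its route after a prepending change.
    'ge': abandon the current path when its length is >= the alternative.
    'gt': abandon it only when strictly greater (the 'sticky' behaviour)."""
    cur, alt = (len_l1, len_l2) if current == "L1" else (len_l2, len_l1)
    leave = cur >= alt if policy == "ge" else cur > alt
    return ("L2" if current == "L1" else "L1") if leave else current

# AS 3257-style 'greater than' AS: 4 hops via L1, 6 via L2 (incl. one prepend).
path, history = "L1", []
for p in [0, 1, 2, 3, 2, 1]:          # prepending length applied on L1
    path = next_path(path, 4 + p, 6, "gt")
    history.append(path)
print(history)   # ['L1', 'L1', 'L1', 'L2', 'L2', 'L1'] -- note the hysteresis
```

At prepending length 2 the AS stays on whichever link it was already using (6 is not strictly greater than 6), which is exactly the unbalanced up-vs-down behaviour the measurements show.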
2006.06.05 NANOG-NOTES TCP authentication with Ron Bonica
2006.06.05 Ron Bonica
slides are at http://www.nanog.org/mtg-0606/pdf/ron-bonica-joint%20presenters.pdf
Authentication for TCP-based routing and management protocols; a joint presentation from Alcatel, Cisco, and Juniper.
This started at the NANOG in Washington 2 years ago, at the security BOF; someone said they would use MD5 auth if they could update keys without bouncing their sessions. A surprisingly small number of people are actually using MD5 authentication.
Motivation:
many operators don't authenticate TCP-based routing protocols
RFC 2385 doesn't meet operator needs
Concerns:
CPU utilization: not so much of an issue for Juniper [Cisco, Alcatel]; the Juniper architecture separates the forwarding and control planes
Key management: hard to change keys; requires bouncing sessions
Weak cryptography: easy attacks against MD5
Alternative approaches:
Application: in the protocols (BGP, LDP, etc.); TLS--too much of a headache
Transport: TCP
Network: IKE/IPsec
Chosen approach: better TCP authentication, via an enhanced TCP auth option.
Hitless key rollover:
key chains configured on peer systems
time-based key rollover
key identifier
Stronger cryptography: HMAC-SHA-1-96, CMAC-AES-128-96.
Enhanced Authentication Option format: Kind - Length - T/K - Alg ID - Res - Key ID - KEY.
A key chain contains a tolerance parameter and up to 64 keys; each key contains:
id [0..63]
auth algorithm
shared secret
start and end time, both for transmit and receive
Sending system procedure:
Identify active key candidates: start-time <= system-time, and end-time > system-time.
If there are no candidates, log the event and discard the outbound packet.
If there are multiple candidates, select the key with the most recent start-time for sending.
Calculate the MAC using the active key: calculate over the TCP pseudo-header, TCP header, and TCP payload; by default, include TCP options (if you set the T bit, ignore TCP options).
Format the enhanced auth option: active key ID ...
Receiving system procedure:
Look up the key specified by the TCP option, and determine whether that key is eligible:
start-time <= system-time - tolerance
end-time > end-time + tolerance
[not sure if that shouldn't be end-time > system-time + tolerance, actually. --MNP]
Calculate the MAC; if the calculated MAC matches the received MAC, accept the packet.
Auth error procedure:
discard the datagram
log
do NOT send an indication to the originator (doesn't adjust TCP counters)
Config example: see the examples in the slide deck; they went past too quickly.
Q: How many of us are authenticating iBGP sessions?
A: The majority in the room are.
Q: How many of us are interested in a better way of handling key changes?
A: Lots of people!
Q: Russ Bundy(?): are you planning on taking this work into the IETF to publish through that path?
A: Yes; it went to the RPsec, RPsec2, and SAAG working group mailing lists.
Q: Randy Bush, IIJ: were there any simpler proposals? Clearly this was designed for the IVTF.
A: None that weren't already rejected by the team themselves.
Q: Steve Bellovin, Columbia U. (no longer security IAD): Why reject IKE and IPsec? It does all this, plus more (which isn't so good). Why not tie the algorithm to the key, get it out of the header, and get more bits for other uses?
A: Actually, the algorithm could be taken from the key; that's the type of comment they're looking for in the IETF. One argument for putting it in the option is that it's a quick way to check without calculating the MAC. The second question is more interesting: why not use IKE with just authentication--no need for confidentiality in this case? It was discussed; one idea was to just use IPsec with preshared keys, but then you have the same key rollover problem, plus key negotiation. Those are all probably good ideas. They would like to do this as a first phase, allowing manual key rollovers; in a second phase, you could negotiate a key for one-time use.
Q: But in IKE/IPsec, you can use preshared key mode in IKE.
A: But in this case, you'd still need a system like this to roll over the keys, since you want to be able to change keys on each end asynchronously.
Christopher Ranch: Made the right choices, thank you!
Q: Eric (?) from Cisco: why is this more complicated?
A: Being able to have multiple keys and roll them over. There are networks that have used the same key for 10 years, since they don't want to bounce their sessions. You just can't do that with IKE. This is an operations-driven requirement: it must be hitless.
Q: Jared Mauch, NTT America: how does he go in, take his iBGP sessions, and roll to this system without making the NANOG mailing list?
A: [no answer provided]
Q: Bora Kilf(?), Broadcom: about IKE not being able to roll keys without a hit; if you use IKEv1, you can lose the IKE SA, keep the IPsec SA, rekey your IKE SA, and then rekey your IPsec SA. He agrees with Steve; it looks like they're re-inventing large parts of IKE/IPsec all over again.
Q: Gary (?): we want to avoid colo meets; you want to be able to reset keys without having to coordinate people in different timezones.
Encourage people to participate in SAAG to discuss this and provide feedback.
Research forum speakers up next.
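The hitless-rollover trick in the sending-system procedure above is that key lifetimes may overlap, and the sender always picks the eligible key with the most recent start-time. A small sketch of that selection plus the 96-bit HMAC-SHA-1 truncation (my own illustration, not the draft's code; the keychain contents are made up, and the real MAC covers the TCP pseudo-header, header, and payload, which I only stand in for with a byte string):

```python
import hmac, hashlib

def active_key(keychain, now):
    """Sending side: keys are eligible when start <= now < end; if several
    are eligible (an overlap window), the most recent start-time wins."""
    candidates = [k for k in keychain if k["start"] <= now < k["end"]]
    if not candidates:
        raise LookupError("no eligible key; log and discard outbound packet")
    return max(candidates, key=lambda k: k["start"])

def mac_96(secret, message):
    """HMAC-SHA-1 truncated to 96 bits (12 bytes), as in the proposed option."""
    return hmac.new(secret, message, hashlib.sha1).digest()[:12]

keychain = [
    {"id": 0, "secret": b"old-shared-secret", "start": 0,    "end": 2000},
    {"id": 1, "secret": b"new-shared-secret", "start": 1000, "end": 3000},
]
k = active_key(keychain, now=1500)      # overlap window: both keys eligible
print(k["id"])                          # 1 -- most recent start-time wins
print(len(mac_96(k["secret"], b"pseudo-header + header + payload")))  # 12
```

Because the receiver looks the key up by the Key ID carried in the option (within the tolerance window), each end can move to the new key on its own schedule, which is what makes the rollover hitless.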
2006.06.05 NANOG welcome notes
(getting my notes from today's talks out, finally. ^_^ ; --Matt)
2006.06.05 Welcome notes
Program chair, Steve Feldman.
Thanks to Rodney Joffe, NeuStar/Ultra Services.
The people who were instrumental in getting connectivity into the room here deserve a big round of applause.
NANOG program committee, Joe Abley ... wow, slide went fast. ^_^;
Agenda changes--none so far.
Remote participation; streaming options: http://www.nanog.org/streaming.html
RealMedia, multicast MPEG2.
Reminders:
Network security: don't use cleartext passwords! Do use end-to-end encryption (ssh, vpn).
PGP key signing: see the link off www.nanog.org for details.
Beer and Gear tonight.
Interpreting badges:
blue badge: steering committee
yellow badge: program committee
red badge: mailing list committee
green = blue + yellow
green dot = peering coordinator
black dot = network security
red dot = PGP signing
Lightning talks:
six 10-minute slots available
must be on-topic for the mailing list
signups at http://www.nanogpc.org/lightning
deadline is 12:30pm Tuesday; talks will be Wednesday morning.
Over to Rodney Joffe now.
Welcome to NANOG 37: almost the NANOG that wasn't. It took significant effort, thanks to Merit and others; an uphill battle, in both time and location. He encourages other people to host! It's not expensive, it just takes time.
Benefits of hosting:
you choose the location (sort of): much easier to do at home; if you wait too long, you don't have a choice of venues
tee-shirt: you get to pick the designs for it!
wonderful engineering opportunities!
NANOG community--you get to give something back. He's hosted through three companies now over time.
exposure--touched on by Randy and Bill yesterday: generate favourable goodwill for your company; it's not a marketing event, but it still gets your name out.
Network architecture:
Plan A: existing SBC/AT&T OC12 at the demarc. Cool! We should be able to get connectivity real cheap. Nope: the SJCC exclusive licensee says there's a $10K charge to access it, even if the gear is donated.
Plan B: AboveNet GigE at the demarc, hot but unused, and owned by AboveNet from an earlier conference. Nope: the SJCC exclusive licensee still has a mandatory $10K access fee.
Plan C: the Hilton has fiber run to the SJCC infrastructure. OK, let's use microwave to connect the Hilton to MAE-WEST at 55 S. Market St.: microwave from the Hilton to Joe Pucik's 13th-floor office at MAE-WEST, fiber from the 12th floor to the 2nd-floor meet-me room, cross-connect to S&D, across S&D fiber to PAIX Palo Alto, to Jared Mauch/NTT. But... 55 S. Market is a Faraday cage. :( No signal. FC-quality glass.
Plan D: Tango 54Mb microwave from the Hilton to Dave 'Bungie' Rand's 18th-floor penthouse roof at 50 West San Fernando, fiber from Bungie to Switch and Data PAIX, to Jared Mauch at NTT.
The hotel wasn't the biggest challenge; connectivity was. The first thing you need to do, after picking a hotel that can handle 500 people and 200 rooms: ONLY pick hotels that have fiber with someone you recognize from NANOG. The second thing is to make sure there aren't any access or use fees that have nothing to do with the equipment or bringing gear in. Otherwise, you find there are interesting charges that can be levied against you as you progress!
Shout-outs: Jared Mauch, Christopher Queseda, Joe Pucik(?), Ralph Whitmore, Dave Rand, UltraDNS and NeuStar, and the volunteers!
$10,000 colo space--the picture is evil. The other end is the penthouse of the Knight Ridder building. Nice shot!
2006.06.04 NANOG Open Community Meeting Notes
Here's my notes from tonight's (overly!) long Open Community Meeting. When I said "yes" to going long vs cutting the discussion short, I was thinking we'd go long by 10 minutes or so... not by a whole hour+ ^_^;;
Matt
2006.06.04 NANOG Open Community Meeting
NANOG/San Jose
NANOG SC <[EMAIL PROTECTED]>
AGENDA:
Steering Committee (Randy)
Program Committee (Steve)
Financial Report (Betty)
Mailing List Report (Rob)
Open Discussion (Randy)
Aside from vetted presentations, we're speaking as individuals, and we differ! The microphone is open throughout.
NANOG Steering Committee Report
SC goings-on:
get the meeting schedule under better control (thank you, Rodney!)
charter amendments to be voted on in October
ML policy and procedures
copyright and IPR issues
press, photography, ...
Future meetings (thank you, Rodney, for pulling chestnuts out of the fire!):
October in St. Louis with ARIN; since the meeting is with ARIN, it's Monday/Tuesday, and they have Wednesday-Friday.
Joe is organizing Toronto for the first week of February 2007.
Josh is looking at Miami for the first week of June 2007.
Mailing list:
still working with the ML Panel to document their process
still working with the ML Panel to develop an appeals process (vote in October)
statistics are published monthly on the NANOG web site
please volunteer for the ML Committee; a selection will happen in October. 3+1 so far; that's the lowest limit right now. Need to share the load.
ML Panel appointments: no terms, etc., in the current charter.
Straw proposal for a charter change, paralleling the SC and PC: two-year terms, staggered; two sequential terms max without a vacation.
This would give members a light at the end of the tunnel, and volunteers would know what they're signing up for. It allows change without the bad vibe of removal; normal organizational practice.
Charter changes: Randy and Steve F. are working with Dan Golding and Steve Gibbard, the old charter group, to get the known charter revision proposal pieces in order for the San Jose meeting
(for voting in October) (not sure they'll quite make it)
Dan Golding: there's a lot of bootstrapping language in the current charter; the changes will remove the bootstrapping language, list terms, which things are staggered, etc. Hoping to publish by the end of this meeting. Mostly bookkeeping.
Rights in data:
the NANOG trademark is held by Merit
presentations are copyright by the author
the right to freely distribute, but not modify, is granted to NANOG
the PC is drafting this formally
copyright notices on slides are OK if small and unobtrusive
but what about rights to streaming and videos?
Press:
press are likely to be present in San Jose; Merit may prominently tag their badges.
Merit will ask that no pictures be taken in the actual meetings themselves. This ensures that members are free from having their picture published without their consent, and without prejudice as to who is taking the pictures.
Ren has been taking pictures of individuals to post on the Multiply site; so, if you haven't had your picture taken in the hallway, talk to her. But again, not in the meeting room.
Mailing lists:
engineering and ops discussion only: <[EMAIL PROTECTED]>
discussions about NANOG itself: <[EMAIL PROTECTED]>
Steering Committee: <[EMAIL PROTECTED]>
Program Committee: <[EMAIL PROTECTED]>
ML: <[EMAIL PROTECTED]>
Agenda: over to the Program Committee (Steve).
Steve Feldman, PC chair.
Disclaimers--all opinions and errors are his; the program is the result of hard work by the whole PC.
NANOG 37 program: 37 submissions, up from 26; 22 accepted, 1 cancelled, 14 rejected. No breakdown on the rejections; they were fine, but there wasn't enough time to put them all in; some will resubmit for next time.
Areas for improvement:
speaker solicitation--the PC still needs to solicit more, and get broader representation of the NANOG community
scheduling this meeting was harder; we didn't know where it was going to be.
Program format:
Monday-Wednesday format: morning plenaries; afternoon BOFs and tutorials; evening social events (moved the program out of the evenings for more social time).
Newbie meeting--beefed it up with more content from Bill.
About a 50/50 split in terms of how many like each format. After October, will use survey results for subsequent meetings.
Lightning talks:
criterion: on-topic for the mailing list
signups start Monday morning; instructions during the plenary; deadline 12:30 Tuesday
the PC selects the 6 best submissions
they take place during 1 hour of the Wednesday morning plenary
Feedback: talk to us! PC members have yellow and green badges. Send mail to [EMAIL PROTECTED] [EMAIL PROTECTED] And fill out your surveys!!
Open discussion:
What topics would you like to hear more (or less) about? How can we (PC and community) get speakers on those topics?
Charter revisions: some proposed PC updates; take his name and the bootstrap details out of it; specify when the PC is selected by the SC, things like that. A draft will be published on the futures list.
Transparency: more debate recently about how much of the workings of the PC should be visible to the outside. PC members feel the discussion has to be in
2006.06.04 NANOG new attendee orientation meeting notes
Here's my notes from tonight's 'NANOG new attendee orientation meeting'.
Matt
2006.06.04 NANOG New Attendee Orientation
NOTES:
NANOG organization:
Steering Committee (blue badges)
Program Committee (yellow badges): decide what's on the agenda (green badges are both)
Mailing List Committee (smoke badges)
Merit Network staff
NANOG organization (2):
Meeting hosts (Rodney Joffe): rescued NANOG after the previous host pulled out
Sponsors: pay for Beer and Gear, breaks
The NANOG community:
community meetings
meeting surveys
elections
[EMAIL PROTECTED]
Don't forget to fill out the surveys!!
Meeting structure:
Plenary sessions (big room)
Tutorials
Panels (sorta like tutorials), like the Network Neutrality panel
BOFs: usually tools, security, peering, at least; not recorded, not webcast, more informal, more candid
Social events: Beer and Gear, tomorrow evening; 8 exhibitors showing off their latest gear
Etiquette: mutual respect is the big thing; no personal attacks. You can criticize ideas, but avoid ad hominem attacks.
Dress code: must wear shirt and shoes at all times. Pants or shorts would be nice, too.
The press and photography:
Members of the press may be present.
Photography is not permitted during sessions, due to legal liability/copyright issues; it's also distracting!
All sessions except BOFs are recorded and webcast, and will be available for replay later off nanog.org.
Reporters should identify themselves if they speak with you.
Program selection process:
17 on the program committee, one selected by Merit
call for presentations, with a deadline
ratings: each PC member rates each submission on a scale of 1-5 and adds comments
conference call to determine consensus
Usually most are good; some need suggestions on what is needed before they can be approved.
Program process (2):
second round of comments
second-round conference call
anything after that: chair's discretion
Some stuff had to be turned away this time, as the schedule was full.
Now over to Betty Burke for the Merit update.
NANOG 37 newcomer meeting

Betty Burke, Project Manager, Merit Network
She'll be giving an overview of Merit, and why it's involved with NANOG. Bill Norton will go more through the relationship through history. She also handles the Michigan technology center, so she wears multiple hats.

Merit and NANOG
  Continuation of shared values
  Commitment to R&D
  National involvement
  Regional educational activities

Much of the background between Merit and NANOG is in the shared research and development focus. Merit originally focused on the Michigan area research network; now it covers research and development for widespread activities, nationally and internationally. Merit is a 501(c)(3) entity in Michigan, non-profit, one of the largest network providers in Michigan; they do R&D only, no commercial side.

NANOG hosts and sponsors
  Hosts: work with Merit to locate a hotel, provide connectivity, build the hotel network, and staff the meeting.
  Break sponsor: an engineer from your organization exhibits your equipment on a tabletop display.
    Break slots are 30 minutes, one vendor per event.
  Beer 'n Gear: display your equipment at a table staffed by two engineers.

Merit is a 501(c)(3) regional network, 40 years young
  Owned by Michigan public universities
  Hosted at the University of Michigan
  Located in the Michigan information technology center with Internet2

Org chart
  Merit Board
  Merit CEO (new, starting in July, reports to the Board)
  Directors: R&D, Network
  Managers
  Staff
  The hierarchy is to allow decisionmaking, but it isn't rigid; ideas can flow in either direction as needed.

Mission statement: to be a respected leader in developing and providing advanced networking services to the research and education community. Merit is a trusted source for providing high-quality network infrastructure; initiating and facilitating collaboration; and providing knowledge and technology transfer through outreach...

Supporting our Mission
  MichNet, Merit's statewide network, as well as Internet2
  Research and Development.
Bill Norton, unvetted slides. Freshly unslept (new kids will do that to you).

NANOG History (v0.2)
William B. Norton
Co-Founder/Chief Technical Liaison
Equinix, Inc.
[EMAIL PROTECTED]

He's used to dealing with lots of suits, with translators, transcribers, etc.
Why did the Frenchman have only one egg in his omelette? One egg is an oeuf!

What do I know about NANOG?
  Merit staff 1987-1998
  NANOG chair 1995-1998
  Developed the 1st business plan for NANOG: financially self-sustaining
  Started a number of NANOG traditions:
    NANOG T-shirts
    Numbering NANOGs
    Colored NANOG nametags
    Beer-n-Gear
    Cookie-graph
    Surveys
    Etc.

Q. What to expect in a typical day?
A. Current meeting structure: 2.5 days; it may look like a terminal room. Sunday-Wednesday. Sunday is the newcomer's welcome and community meeting. Monday: main sessions...

NANOG spreads travel burdens (NANOG 6, in San Diego, was my first). It's still pretty much true that it's cheaper if you stay over Saturday night in terms of flights.

1987-1994 NSF funded $ -> Merit
Re: Have Yahoo! gone pink?
On 3/29/06, Peter Corlett <[EMAIL PROTECTED]> wrote:
> I'm getting a *flood* of spam coming in from Yahoo! mailservers, both to my
> personal and work addresses. It seems that Yahoo! don't care. Here's the
> response to me piping a sample one through Spamcop:
> http://abuse.mooli.org.uk/yahoospam
>
> Yahoo claim "After investigation, we have determined that this email message
> did not originate from the Yahoo! Mail system. It appears that the sender of
> this message forged the header information to give the impression that it
> came from the Yahoo! Mail system."
>
> The spam headers claim otherwise:
>
> Received: from mrout3.yahoo.com ([216.145.54.173])
>     by relay-1.mail.uksolutions.net with esmtp (Exim 4.50)
>     id 1FJbCW-0002Ag-IV
>     for [EMAIL PROTECTED]; Wed, 15 Mar 2006 18:58:29 +
>
> As does DNS and whois:
>
> [EMAIL PROTECTED]:~$ host 216.145.54.173
> 173.54.145.216.in-addr.arpa domain name pointer mrout3.yahoo.com.
> [EMAIL PROTECTED]:~$ host mrout3.yahoo.com
> mrout3.yahoo.com has address 216.145.54.173
> [EMAIL PROTECTED]:~$ whois 216.145.54.173
> OrgName:    Yahoo! Inc.
> OrgID:      YAHOOI-2
> Address:    701 First Avenue
> City:       Sunnyvale
> StateProv:  CA
> PostalCode: 94089
> Country:    US
> [etc]
>
> Doing double-DNS lookups of the IP addresses on other spams also give
> yahoo.com hostnames, and they're typically in DNSBLs for being sources of
> spam and a useless abuse address.
>
> So, which IP blocks shall I null-route then? Or is there anybody here from
> Yahoo! with a clue? (OK, you can all stop laughing now.)

[I'm wearing my personal hat here.]

Ewww. p4pnet.net is part of a company Yahoo acquired that is still in the
process of being integrated. :(

Personally, I'd just null-route the blocks; I'm sure it'll decrease the load
on the Internet as a whole while Yahoo works on trying to clean up their
acquisitions. Of course, that's me speaking for myself, and not in any way,
shape or form speaking for my employer.
^_^;;

There are spam-clueful people at Yahoo from the NANAE and anti-spam
communities; when stuff like this shows up in public forums, it does get
noticed and passed along. I agree, it would be better if it could garner
the right level of attention without being called out in public forums
like this, though.

Matt
--
PGP key ID E85DC776 - finger [EMAIL PROTECTED] for full key
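The "double-DNS lookup" Peter describes is forward-confirmed reverse DNS: PTR(ip) gives a name, and the forward lookup of that name must include the original IP. A minimal sketch follows; the resolver functions are injectable so the logic can be exercised without network access, and the fallback to `socket` calls is my assumption, not part of the original post:

```python
import socket

def fcrdns_match(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS: PTR(ip) -> name, then the forward
    A records for name must include ip.  Resolver functions are
    injectable for testing; by default the system resolver is used."""
    reverse_lookup = reverse_lookup or (lambda a: socket.gethostbyaddr(a)[0])
    forward_lookup = forward_lookup or (lambda n: socket.gethostbyname_ex(n)[2])
    try:
        name = reverse_lookup(ip)   # e.g. 216.145.54.173 -> mrout3.yahoo.com
        return name, ip in forward_lookup(name)
    except OSError:
        return None, False
```

In the case above, the PTR for 216.145.54.173 resolved to mrout3.yahoo.com and the forward lookup confirmed it, so the Received header was not forged.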
Re: shim6 @ NANOG
On 3/4/06, Iljitsch van Beijnum <[EMAIL PROTECTED]> wrote:
> On 4-mrt-2006, at 14:07, Kevin Day wrote:
>>> We got lucky with CIDR because even though all default free
>>> routers had to be upgraded in a short time, it really wasn't that
>>> painful. [Because there was no need to renumber]
>> Isn't that an excellent argument against shim6 though?
>> In IPv4, something unanticipated by the original developers occurred
>> (the need for CIDR), and everyone said "Oh, thank god that all we
>> have to do is upgrade DFZ routers."
>
> You are absolutely right that having to upgrade not only all hosts in
> a multihomed site, but also all the hosts they communicate with, is an
> important weakness of shim6. We looked very hard at ways to do this
> type of multihoming that would work if only the hosts in the
> multihomed site were updated, but that just wouldn't fly.

And given that any network big enough to get their own PI /32 has *zero*
incentive to install/support shim6, all those smaller networks that are
pushed to install shim6 are going to see *zero* benefit when they try to
reach the major sites on the internet.

What benefit does shim6 bring, if only the little guys are using it?
This dog won't hunt. Move on to something useful.

> Yes, this is an issue. If we have to wait for a major release or even
> a service pack, that will take some time. But OS vendors have
> software update mechanisms in place so they could send out shim6 code
> in between.

And no major company supports/allows automated software update mechanisms
to run on their production machines; it adds too much of an element of
randomness to an environment that has to be as deterministic as possible
in its behaviour.

> But again, it cuts both ways: if only two people run shim6 code,
> those two people gain shim6 benefits immediately.

Cool. So let individuals make a choice to install it if they want.
But that's a choice they make, and should not be part of a mandated IP
allocation policy, because otherwise we're codifying a split between
"big" companies and everyone else. The companies that can justify /32
allocations _aren't_ going to install shim6; they already have their
multihoming options (for the most part) covered, so the little guys who
install shim6 to "multihome" are going to discover it doesn't do diddly
squat for helping them reach any major sites on the internet during an
outage of one of their providers. You haven't preserved end-to-end
connectivity this way, you've just waved a pretty picture in front of the
smaller company's face to make them think they'll have the benefits of
multihoming when they really don't.

>> Getting systems not controlled by the networking department of an
>> organization upgraded, when it's for reasons that are not easily
>> visible to the end user, will be extraordinarily difficult to start
>> with. Adding shim6 at all to hosts will be one fight. Any upgrades
>> or changes later to add features will be another.

> One thing I'll take away from these discussions is that we should
> redouble our efforts to support shim6 in middleboxes as an
> alternative for doing it in individual hosts, so deployment can be
> easier.

Won't matter. shim6 on a middlebox still won't be able to re-route to the
majority of the large sites on the Internet during an outage on one of the
upstream providers, given that the large content players and large network
providers aren't going to be installing shim6 on their servers and load
balancers.

>> The real "injustice" about this is that it's creating two classes
>> of organizations on the internet. One that meets the guidelines
>> to get PI space, multihomes with BGP, and can run their whole
>> network (including shim6less legacy devices) with full redundancy,
>> even when talking to shim6-unaware clients.
>> Another (most likely smaller) that can't meet the rules to get PI
>> space, and is forced to spend money upgrading hardware and software
>> to a shim6-compatible solution or face lower reliability than their
>> bigger competitors.

> And that's exactly why it's so hard to come up with a good PI policy:
> you can't just impose an arbitrary limit, because that would be
> anti-competitive.

You failed to note that the smaller company, *even after spending money
upgrading hardware and software to a shim6-compatible solution*, won't
achieve the same reliability as their bigger competitors (see above if
you missed it).

shim6 is _more_ anti-competitive than extending the existing IP allocation
policies from v4 into v6, and is therefore not going to garner the support
of the companies that actually spend money to create this thing we call the
Internet. And without money behind it, the effort is a non-starter.

>> Someone earlier brought up that a move to shim6, or not being able
>> to get PI space, was similar to the move to name-based vhosting (HTTP/
>> 1.1 shared IP hosting). It is, somewhat. It was developed to allow
>> hosting companies to preserve address space, instead of assigning
>> one IP address per hostname. (Again, however, this could be done
>> slowly, without forcing end users to do any
NANOG36-NOTES 2006.02.15 talk 3 Katrina Panel
2006.02.15 Katrina Recovery Panel

Moderator: Sean Donelan, Cisco
Members:
  Paula Rhea, Verizon
  Josh Snowhorn, Terremark
  Bobby Cates, NASA

Sean Donelan was with SBC when Katrina hit, now with Cisco. Dave couldn't be here, but Sean will do his BellSouth slides.

Lessons learned
  Industry has to be able to function as a first responder to provide critical infrastructure in support of state/local response.
  Certain sectors may need heightened support, including power and voice/data communications.
  Providing security in times of crisis may fall back to the private sector.
  Need to understand how the government works in a crisis: National Response Plan, the FEMA system, etc.

BellSouth lost COs for the first time in 100+ years of business. When you get a direct hit, you will be impacted, period. More important is how you recover! Most national disasters are pretty quick; we know how to deal with the short term, but as the issue drags on, security for personnel becomes more and more vital, and is turned over to the private sector, while public security is engaged on more important issues. We need to help shape up the government to avoid issues like Katrina from happening again.

BellSouth, lessons learned:
  Partnerships with other carriers, state and local government, the power companies, and the federal government made the difference.
  Experience and trust are key in a crisis.
  Get involved: know how to reach the communications ISAC and national coordinating center in a crisis.
    703-607-4950
    NCS at NCS.gov, or NCC: telecom-isac at ncs.gov, operational 24/7/365
  Know what programs are available to you and your customers: GETS/TSP/WPS

Bobby Cates from NASA up next. Supported first responders right after Katrina; they were providing video coverage, supported voice over IP, sat phones, etc. in the first days. The commercial facilities were better than gov't lines, actually. When the president came in, the military took over all satellite frequencies, so VoIP over commercial internet was what was left.
Phones came from Bill Woodcock at PCH, servers from some bay area folks; they got gear loaded onto a C5 that was warming up and flew it out; the cost was less than one set of satellite phones. TSP was interesting for Katrina: for higher bandwidth, higher pricing, but not much difference between TSP and non-TSP restoral. WiMAX and VoIP pretty much saved the day, easy to implement.

Josh Snowhorn, NOTA: didn't take Katrina too badly, but Wilma hit him hard.

Only 3 cat 5 hurricanes have made US landfall (Andrew 1992, Camille 1969, and the 1935 Labor Day storm). Cat 4 and cat 3 hit more often, and there are hundreds a year of the smaller ones.

Saffir-Simpson hurricane scale:
  cat 1: winds 74-95mph; wind, water
  cat 2: 96-110mph, storm surge 6-8ft
  cat 3: 111-130mph, surge 9-12ft, much structural damage
  cat 4: 131-155mph, surge 13-18ft at landfall (Katrina at the coast)
  cat 5: 155mph+, surge 18ft+

Wilma was cat 5 before landfall, as was Katrina; Wilma had the lowest barometric pressure ever recorded in the Atlantic. 27 named storms last year, lots of warm water heading into the gulf. Wilma formed off the Bahamas, with very little warning before it hit south Florida, and did a bunch of power line damage. NOTA faced many issues during the storm.

2005: most storms in recorded history. The 2005 hurricane season went to 27 named storms, representing the first time in history that the naming scheme went into the Greek alphabet.

NOTA, pre-Wilma: 3 "happy balls" on the roof; with 100mph winds on the curtain wall, it should be able to withstand it.
Lost one of their roof balls during Wilma.
NOTA lost commercial power and went on gensets for 31 hours during Katrina in July.
NOTA lost commercial power for 10 hours with Wilma, but had to stay off it for 30 hours, it was so dirty.
The majority of enterprises and businesses in south Florida were without power for 10 days.
The day after Wilma, no less than 20 truckloads of servers and infrastructure arrived at the NAP loading docks, with sales people and contracts.
Within 2 days of the passing of Wilma, we began to receive phone calls asking for fuel truck help from undersea cable operators and large enterprises; everyone pitched in to help all of the other operators in the area.
12 undersea cables come in; you cut them off, and South America largely goes away.
Only 1 carrier fully lost a CO, in north Miami, bringing down their circuits that came out of the NAP; water came in, shorted things out.
Many companies did not plan properly for power failures, staff recovery, and access to systems after the storms have passed:
  a large portion didn't have DR plans or backups
  staff who lose their homes need food/water, won't go to work
  those who want to work cannot go to devastated offices, so they need to work from home
Getting employees access to systems is the singular issue that IT directors faced post Katrina and Wilma.
KEEP a dialup access point; it's often the only thing left in a disaster like that.

Sean, to NASA: for packet traffic, what traffic did you see: lots of traffic you didn't plan for, or business as usual? The emergency response was for NCS, FEMA, DoD, as well as thei
NANOG36-NOTES 2006.02.15 talks 5-end Lightning talks, closing notes
(they weren't kidding about lightning!! ^_^;; )

2006.02.15 Lightning Talks:
  Infrastructure (DNS and Routing) Security - Status and Update, by Sandra Murphy
  Need for Speed: What's next after 10GE?, by Mike Hughes
  A Brief Look at Some DNS Query Data, by John Kristoff
  The impact of fiber access to ISP backbones in .jp, by Kenjiro Cho
  New Network Monitoring Interest Group, by Mike Caudill
  Understanding the Network-Level Behavior of Spammers, by Nick Feamster (presented by Randy Bush)
12:20-12:30 Closing Remarks: Steve Feldman, CNET; Susan Harris, Merit
Reload your agenda for the slides!!

Fun with gnuplot: DNS query data, John Kristoff
X axis: source port of client query to the DNS server; Y axis: how many times that port was used.
Looking at a recursive server for an institution, open to inside and outside, on 2005.11.22.
Starting at 1024, lots of clients use that port, then declining to the right; 1025 is the most popular port; it wraps at 5000, where Windows starts over. To the right, UNIX boxes start with high ports.
Port 137 is Windows stuff, all bogus Windows lookups.
Port 5353 is multicast DNS; Macs use it; also bogus.
Some very interesting outliers, either misconfigured or poorly thought out OS/stacks.
Graphs are similar at different institutions, and at large ISPs.
If you take out the external queries, points below 1024 (except 53) seem to be machines behind PAT boxes.
Port 1900 is the plug-and-play port, so Windows can't use it as a source port, making it a low outlier.
External queries show more outliers in the low range.
Looking at PTR queries internally: no elbow at 1025. The 5353 standout is still there, multicast PTR queries, all bogus. MX queries, same thing: not many outliers, very clean; possibly bogus, though.
A Windows box trying to contact an IRC server (a neutered bot box) kept using the same source port over and over until firewall/virus software removed it.
A trojaned UNIX box used ports constantly across the whole range.
A normal UNIX box shows more typical rows of different ports.
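The tally behind these per-port plots can be sketched as below, assuming the client source ports have already been extracted from a trace into a list; the sample data and helper names here are made up for illustration:

```python
from collections import Counter

def port_histogram(src_ports):
    """Count how often each client source port appears, as in the
    per-port DNS query plots described above."""
    return Counter(src_ports)

def suspicious_low_ports(hist, dns_port=53):
    """Source ports below 1024 (other than 53) often indicate clients
    behind PAT boxes or odd stacks."""
    return sorted(p for p in hist if p < 1024 and p != dns_port)

# made-up sample: Windows-style low ephemeral ports plus one odd low port
sample = [1025, 1025, 1026, 1024, 137, 4999, 32768]
hist = port_histogram(sample)
```

Feeding the histogram to gnuplot (port on X, count on Y) reproduces the kind of graph described in the talk.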
Looking beyond source ports, what other useful info and patterns can you start to discern? Look at TTLs, destination ports, all sorts of fun you can start to discover.

Sandra Murphy (sandy at spart.com, sandy at tislabs.com)
DNS and routing security

DNSSEC is live: Sweden has signed its top level zone, RIPE is signing reverse zones, some reverse delegations.
http://www.dnssec-deployment.org/
  Open working group, the DNSSEC deployment initiative, focused on deployment issues; active mailing list, regular telecons; organizes workshops at conferences, etc.
Screenshot of the site: has roadmaps, working group signups, mailing lists, operator guidelines, links to NIST, etc., events, and actions.

DNSSEC-tools project: creates tools/patches for web browsers and such.
http://www.dnssec-tools.org/
  Current release is v0.9, from 2/10/2006.
  Firefox 1.5 RPM to check DNSSEC records.
  Shot of the tools being released. The zonesigner tool is how you sign and maintain a signed zone. Some very detailed documents on how to sign and maintain a signed zone, as well as mailing lists. Sourceforge link for the dnssec-tools bundle.

Securing the routing infrastructure: big problem, no traction on deployable solutions. 3 workshops with a wide net of interested parties.
Operators, ISPs, access and content providers, vendors, security people; DHS hosted, anxious to find a solution.
http://www.hsrpacyber.com/public/

Operators' emphasis
  A strong call from the operators for an authenticated list of authorized prefix originations (accurate, complete, secure):
    respond to customer requests to route prefixes
    useful in debugging routing difficulties

NEW ARIN policy suggestion/recommendation
  New field in address templates (direct and subdelegations) for a list of permitted ASes.
  Benefits:
    inherits the self-discipline of completing the form (IRR entries aren't always done)
    inherits the scrutiny of the ARIN process on creation
    ARIN is the authority for who is allocated prefixes
  Any IRR would have to check the prefix with the RIR.

Authentication and currency in IRRs
  Authentication of IRR objects:
    RIR-run IRRs have internal access to authentication for prefix holders.
    Non-RIR-run IRRs would have to find a way to get that authentication from the RIRs.
    The same is true for RIR IRR objects referring to nonmember resources.
  Currency for IRR objects:
    reclaimed resources have to result in IRR purges
    why not a TTL in IRR objects? Handles non-RIR IRRs.

She solicits requests and feedback. Try the DNSSEC tools, try signing a zone, see how it works. Try the client system that does the DNSSEC validation. Participate in the ARIN ppml list on routing security, etc.

Mike Hughes: what's next after 10GE? (mike at linx.net)
Channels Geoff Huston for a scary graph: the curve of traffic growth. By the end of 2006, he'll be at 150Gb; if he extrapolates from the last 3 months, he'll be at 300Gb in one metro. Where is it coming from? ADSL2, WiMAX, FTTx, Skype, VoIP, p2p, etc. Consolidation: fewer people with bigger pipes. Think back to Seattle: a chap from Force10 came and asked, what do you want, 40G or 100G? we
NANOG36-NOTES 2006.02.15 talk 4 Interdomain Routing Consistency
The access point movie goes whizzing past very quickly as Bill Fenner narrates. It lets you see where people are congregating, which talks are more interesting, and when people migrate out of talks; it could feed into the survey to tell the program committee which talks are of more interest. Netdisco collects data from network elements and plots them, with a front end on it. If you opted in, by emailing him your MAC address, it would render a map with your location on it; it has RSS feeds of your location as well. fenner at research.att.net

2006.02.15 An Inter-domain Consistency Management Layer
Nate Kushman, MIT

Steve Feldman: welcome back. Nate Kushman is up first to talk about routing consistency.
Transient BGP loops. Was with Akamai, now at MIT. With Srikanth Kandula, Dina Katabi, John Wroclawski.

Do loops matter? Can we do something about them? What is a transient BGP loop? (slide showing a loop forming)

How common are "transient BGP loops"?
  Sprint study, IMC 2003, IMW 2002: looked at packet traces from the Sprint backbone.
  Up to 90% of the observed packet loss was caused by routing loops; 60-100% could be attributed to BGP.
  Is it true on the internet?

Routing loop damage
  20 vantage points with BGP feeds; did pings and traceroutes, watched for loops; correlated with BGP updates and TTL-exceeded on ping/traceroute.
  In fact, all loops were within 100 seconds of BGP updates.
  10-15% of all BGP updates caused routing loops!!

Collateral damage: loops cause congestion on the links that are part of the loop, causing loss to non-rerouted networks from non-rerouted-to source networks. Used traceroute to see which links were part of the loop, and which other traces shared a link in common with the loop. There is a marked increase in packet loss in the 100-second window around a BGP loop. Prefixes sharing a loopy link see 19% packet loss in general.

What should be done? We need to prevent forwarding loops.
A loop occurs because one AS pushes a route update to the data plane, but other ASes are not yet aware of that route change. What about telling everyone about the change before the change actually happens?

Suspension:
  continue to route traffic
  tell the control system not to propagate the route
  the FIB stays the same for now; the RIB doesn't send the route
  downstream networks only update forwarding tables once upstreams have acknowledged the path change.

More generally, we have proven:
  loops are prevented in the general case
  convergence properties are similar to normal BGP
  incrementally deployable
http://nsm.lcs.mit.edu/~nkushman/ (feedback welcome)

Clearly this works well for planned maintenance: we can delay the move to a backup path during those events, at least. 20% of update events are caused by planned maintenance. Link-up events also cause loops, and there's no way to plan for them smoothly now.

What about unplanned link-down events? There's a trade-off between loss on the current path and collateral damage. Are we willing to do this in general, to avoid impacting stable prefixes from unstable prefixes?

In short: routing loops are a significant performance concern.

Bill Norton, hidden question: what is the time domain during which these traffic impacts are seen? Will the propagation path take 10, 20, 30 seconds?
A. One event causes many, many loops rippling out, so one update may cause packet loss for many seconds, up to tens of seconds total.
Q. You're talking about adding MORE state information into the network. Also adding latency to update acknowledgements.
Jared notes that router software bugs tend to exacerbate routing loop issues. You can tune configs to try to minimize the number of loops seen, as well as upgrading to "fixed" code to get better results without more state.
Patrick Gilmore asks Jared: does tuning help internal sessions or external sessions? Both; it really controls *when* the updates are sent out (immediately vs batched, etc.). Jared notes the internet is being used. Someone (Bill?)
asks if convergence times are similar to the current model, as the slide claims; is that within a few seconds? Convergence in the lab is similar, yes.
Matt Petach asks about details of convergence; it basically puts you at the mercy of the slowest, farthest-away router on the network, since it has to get the message, realize it has nobody to send to, and then acknowledge back before anyone else can update the FIB. Yes, true, so you'd want to put timers in to limit how long you wait; basically, a "wait 5 seconds, and either hear an ACK, or go ahead and update the FIB" type timeout, so you don't wait forever for a non-conformant device on the other side of the world.
Riverdomain question: with suspension, you're basically in passive mode, listening but not updating, is that correct? Yes, with respect to the links/prefixes in question.
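The transient-loop problem and the suspension idea can be shown with a toy model; the three-router topology and update sequence below are invented for illustration, not taken from the talk:

```python
def forwards_to_dest(fib, start, dest, max_hops=16):
    """Follow per-router next hops; return True if traffic from `start`
    reaches `dest`, False if it loops or hits a black hole."""
    node, seen = start, set()
    while node != dest and len(seen) < max_hops:
        if node in seen or node not in fib:
            return False        # revisited a router: forwarding loop
        seen.add(node)
        node = fib[node]
    return node == dest

# Routers A -> B -> C -> dest.  B reroutes toward A (its new best path)
# and installs it immediately, but A hasn't heard the update yet and
# still points at B: a transient loop A <-> B.
fib_transient = {"A": "B", "B": "A", "C": "dest"}

# Suspension: B keeps forwarding on the old path (FIB unchanged) until
# its neighbors acknowledge the change, so no loop forms meanwhile.
fib_suspended = {"A": "B", "B": "C", "C": "dest"}
```

Running `forwards_to_dest` on the first table detects the loop; on the suspended table, traffic still reaches the destination.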
NANOG36-NOTES 2006.02.15 talk 2 Katrina--telecom infrastructure impacts
2006.02.15 Hurricane Katrina: Telecom Infrastructure Impacts, Solutions, and Opportunities
Paula Rhea, Verizon

A more interactive presentation from her, in the aisles. Verizon Business group: the combined MCI/Verizon team.

Agenda
  Hurricane Katrina recap
  Telecom infrastructure impacts
  Telecom provider successes
  Business continuity planning
  Conclusions
  References
  Appendices: case examples

Many of the people in this room would be considered part of the critical infrastructure for the nation by the Department of Homeland Security. After the World Trade Center 9/11 attack, there was a lessons-learned report; hopefully there will be a similar report post-Katrina.

New Orleans is still very much like a war zone right now; it's definitely a disaster recovery training session for many industries. Neighborhoods are wiped out; no capital investments, infrastructure in a holding pattern. Many with no power, 20% of houses condemned. Neighborhoods that are entirely silent: eerie.

The aim is not to diss anyone specifically, certainly not in this room; it's meant to be an assessment in a neutral fashion.

Critical infrastructure:
  food and water supply
  energy
  transportation
  healthcare
  banking/finance
  telecommunications/infrastructure
Oddly enough, much of the critical infrastructure is privately owned, rather than government owned. The domino model says that the failure of any one piece will cause the rest to start to fall.
New Orleans:
  35th largest city in the US
  Port of New Orleans is #1 in the US by tonnage; 5th largest port
  50% of total US grain exports shipped via the gulf
  10.8% of total US refining from New Orleans
  Key space shuttle facility in Michoud; supported fuel tanks for the international space station

Storm recap
  Hurricane hit Aug 29, 2005
  135MPH winds, 20-foot storm surge sent inland
  55-foot surges logged in the gulf prior to landfall
  Levee failures created a secondary crisis
  2.3M homes without power
  Spawned 33 reported tornadoes
  1090 fatalities in LA recorded to date

People were dancing about cat 5 dropping to cat 4, and thought they were spared; then the levees broke. It had been predicted the year before. :( Still 2500 people missing/unaccounted for.

Map of eastern LA parishes: St. Bernard/Plaquemines parishes, between the lake and the gulf, were hardest hit when the levees broke as water headed back towards the gulf. Lack of interoperability between parish government systems.

New Orleans telecom impact (multi-carrier)
  1.75M lines down immediately following Katrina
  38 911 centers out (1/3) initially
  1,000 cellular towers out
  Two class 4 toll switches initially out of service: no power, unable to secure extended diesel fuel
  Traffic out of the LATA logjammed with the toll switches out.
  LECs had backup power systems, but no fuel.
  Took 4 days to inspect the causeway to allow emergency crews into the city, with the main bridge out.
  Most nurses and doctors were in the suburbs, not in the city.

Central offices post-Katrina
  New Orleans Lake CO CLLI: NWORLALK
  Venice LA CO CLLI: VENCLAMA
  Buras CO CLLI: BURSLAMA
  19 COs were totally destroyed, and will have to be rebuilt.
These slides are public domain info, no inside info.
The I2/Abilene link from Houston to Atlanta was initially out, restored on Sept 8, 2005. The fiber optic path on the Lake Pontchartrain bridge went offline following Hurricane Katrina. WiFi, WiMAX and VoIP played a key role in area communications. The public internet was actually very resilient.

Telecom provider successes (alphabetic):
  1,000 amateur radio operators helped
  BellSouth
  Cingular
  Cisco
  Cox
  Iridium added 10,000+ new phones for first responders
  MCI
  Nortel
  Sprint/Nextel donated up to $10M
  Verizon donated $8M and 200 workers

Carriers have mutual aid agreements; Verizon sent 200 people who volunteered to spend 8 weeks living in a tent to help rebuild, working with armed guards. The CO rebuilds weren't any type of upgrade: it was bulldozing damaged/destroyed facilities, digging new vaults, and starting over to restore just what was in place beforehand.

Bill Norton: with COs underwater, can you imagine some type of preventative design that could have been put in place to help avoid impacts like this? Most of the area is reclaimed land, 2 miles below sea level (some dispute about that number). Bill wonders if they could be built above sea level somehow. Even if they were, Paula notes that they wouldn't have power, wouldn't have 2 weeks of diesel fuel to run them, etc. Really, it comes back to the levees.

Randy Bush noted that early on, community-based wifi was one of the early means of communication, daisy-chaining packets along.

Roland, from Cisco, did some logistical work with relief; Verizon donated EVDO boxes to make EVDO-to-wifi bridges, and did VoIP over wifi to EVDO boxes to jury-rig connectivity. But that doesn't work so well with towers down and no power. With the cell phone infrastructure down, that really hurt too.

Thanks to Todd Underwood/Renesys for their graphs; they did a pre-and-post analysis, routing-wise. Top red is LA; about 170 networks were totally out during the bulk period. Teal/TX was not impacted; MS was also hit, in terms of percentage more s
NANOG36-NOTES 2006.02.15 talk 1 ipv6fix (and boy, does it need it)
Morning intro notes: don't forget to fill out your SURVEYS. Six lightning talks signed up, should be very cool. If you have slides, get them to Steve Feldman to start with! The wireless movie after the break should be cool to watch. Ren: Steve mistakenly introduces her, she corrects him. Don't forget to give feedback via the survey forms!!

2006.02.15 v6fix: Wiping the Slate Clean for IPv6
Kenjiro Cho, WIDE/IIJ; Ruri Hiromi, WIDE/Intec NetCore

They will be talking about their efforts to deploy IPv6, called v6fix.
v6fix is an effort to solve problems in the current v6 deployment:
  focuses on v4/v6 dual stack environments
  it's a technical analysis of real world problems
Kenjiro will talk about tools and measurements.

Deployment status
  The majority of equipment out there is v6-capable, available from major vendors.
  Still, many applications and appliances work only with v4.
  v6 is starting to get into various business fields.
  Many people lack knowledge/experience with v6; when non-experts hit problems, they're clueless.

Example: illiteracy. Hotel internet systems have instructions for guests. Troubleshooting: "if you have IPv6 enabled, please disable IPv6" (brochure in the guest room). Cause of the problem: a combination. DNS redirection returns a specific A record for clients; the stub resolver accepts the A for AAAA queries, and can't get out.

Wiping the slate clean for v6: faulty behaviours are only 1% and often combinatorial, but could be fatal to deployment:
  slow fallback to v4 after v6 errors
  misbehaving DNS resolvers
  filtering of ICMPv6
  DNS misconfigurations
  poorly configured tunnels
  lack of peering or v6 paths

v6fix activities (research group): identify/analyze/solve real-world tech problems in v6 deployment.
Enemy: "after disabling v6, my problems went away."
Cooperation needed between researchers, implementers, and ops.

v6fix topics:
  harmful effects of the on-link assumption
  misbehaving DNS servers and resolvers
  slow fallback to v4 after v6 failures

Examples:

Case 1: DNS loop at a hotel
  Real story of a hotel internet system; went to the same room, investigated.
  DNS is intercepted and redirected to a signup page; IPv6 users can't get beyond the first page; hotel instructions say to disable v6.
  Erroneous DNS redirection system and stub resolver: the redirection system always returns a specific A record when getting non-A queries, and the client's stub resolver, querying for any address, blindly accepts the A response.

Case 2: DNS server slowdown at a Japanese ISP
  The ISP upgraded a DNS cache to BIND9 and received complaints about slowdown.
  Recompiling BIND9 with --disable-ipv6 fixed the problem; reported to JANOG.
  Caused by older BIND9 without IPv6 connectivity: a server without v6 connectivity always tries to talk over v6 first, and ends up falling back to v4 after timeouts.
  Fixed in BIND 9.2.5 and 9.3.1.

Common factors:
  1. problems appear only under specific combinatorial conditions
  2. implementors and operators didn't notice until reported
  3. even for professionals, it's not easy to track down these problems

Kenjiro Cho, Tools: v6 tools and measurement results
Goal: to understand macro-level v6 healthiness.
Current methodologies:
  wide area measurement of the behaviours of 2nd/3rd level DNS servers
  dual stack issues

DNS server measurements of .jp domain responses: 0.13% of DNS servers can't deal with AAAA requests. Most are lame-delegation type errors: they ignore queries, or respond with RCODE 3 ("name error") NXDOMAIN.

Dual-stack path analysis: measurement techniques specifically designed for dual stack; take measurements for v4 and v6 at the same time, compare v6 results with v4 results, and extract problems that exist in v6 only.

Methodology:
  dual-stack node discovery: create a dual-stack node list by monitoring DNS replies
 dual stack ping: run ping/ping6 to target dual-stack nodes; select a few representative nodes per site (/48) by RTT.
 dual-stack traceroute: traceroute/traceroute6 to selected nodes; visualize v6 MTU issues, visualize path issues.
distribution of v6/v4 RTTs
 4000 ping targets; v4 on x-axis, v6 on y-axis.
 individual nodes far above the unity line--leaf issues.
paths and PMTU visualization
 from NYSERNET to ARIN sites: many of the ARIN paths go via jp! (lack of peering)
 from ISC to ARIN sites: paths look much better, but lots of blue == lots of tunnels.
Abilene case: well-known problem. Abilene trying to encourage v6 adoption: no AUP, tunnel services for v6, but ended up with horrible v6 paths, mostly tunnels. ISPs are reluctant to move to paid v6 connectivity. Abilene thinking about suspending its relaxed AUP for v6. The tool tries to illustrate such issues, convince users to move to native v6.
dual-stack traceroute to ABILENE from WIDE (v4 upper, v6 lower): similar RTTs/hops for v4/v6; native dual-stack paths.
dual-stack traceroute to ABILENE from IIJ: similar RTTs, but different paths: currently more common.
dual-stack traceroute to ABILENE from ES: v6 RTTs much larger than v4: roundabout tunnels.
Conclusion: faulty behaviours are only 1% and often combinatorial, but can be fatal to acceptance of v6; slow
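The dual-stack comparison above pairs a v4 and v6 RTT per target and flags nodes far above the unity line. A minimal sketch of that classification step (the threshold factor, slack, and sample data here are made up for illustration, not from the v6fix tools):

```python
# Classify dual-stack targets whose v6 RTT sits well above the v4/v6 unity
# line, suggesting a roundabout tunnel path rather than native dual stack.

def classify_dual_stack(rtts, factor=2.0, slack_ms=10.0):
    """rtts: dict of host -> (v4_rtt_ms, v6_rtt_ms).
    Returns hosts whose v6 RTT exceeds factor * v4 RTT + slack_ms."""
    suspect = []
    for host, (v4, v6) in rtts.items():
        if v6 > v4 * factor + slack_ms:
            suspect.append(host)
    return sorted(suspect)

samples = {
    "native-path.example": (20.0, 22.0),   # near the unity line: native dual stack
    "tunneled.example":    (18.0, 150.0),  # far above it: likely tunnel detour
}
print(classify_dual_stack(samples))  # -> ['tunneled.example']
```

Plotting all targets this way (v4 on x, v6 on y) is what makes the tunneled leaf sites stand out visually.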
NANOG36-NOTES 2006.02.14 Tools BOF Notes
Last notes of the day... Matt
2006.02.14 Tools BOF
Todd Underwood, panel moderator
A number of interesting tools were presented earlier today; all of them are good and interesting and solve a particular set of problems. None are in widespread use. There are a lot of possible reasons: they solve problems you don't have, in which case you can move on to something new; or they solve a problem similar to one you have, but not quite; or they solve a problem you can't quite implement yet. Discuss use cases, problems they're trying to solve, and give feedback, as interactively as people are comfortable with.
3 tools: OpenBGPD, IRR power tools/webtools (to get feedback--and is the IRR even useful anymore?), and Flamingo as one of 2 netflow platforms. Start with Henning, active in open source software development; he'll go into more depth on OpenBGPD.
OpenBGPD
Henning Brauer
henning at openbsd.org
3-process design, principle of least privilege:
 the RDE (route decision engine) does not need any special privileges at all, so it runs as _bgpd:_bgpd, chrooted to /var/empty.
 The Session Engine (SE) needs to bind to 179/tcp.
 The parent needs to modify the kernel routing table.
SE needs to bind to 179/tcp: we have the parent open sockets, see recvmsg(2); the parent needs to keep track of which fds the SE has open, so it doesn't bind() again to the same ip/port. The SE can drop all privs, then.
SE 2: since one process handles all of bgpd, need nonblocking sockets. On a blocking socket, you call write(2) and it won't return until it's done or you get an error; on nonblocking, it returns as soon as it can't proceed immediately. So, you have to handle buffer management.
SE 3: designed an easy-to-use buffer API and message handling system.
Messaging: internal messaging is a core component, reused in OpenNTPD, OpenOSPFD, and some more.
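The nonblocking-write bookkeeping described above (write returns early, unsent bytes must be buffered and retried) can be sketched as follows. This is a toy Python illustration of the general technique, not OpenBGPD's actual C buffer API:

```python
# Nonblocking send with buffer management: send() may accept only part of the
# data (or none, raising BlockingIOError); keep the unsent tail for later.
import socket

def buffered_send(sock, pending: bytearray) -> int:
    """Try to flush `pending` on a nonblocking socket; return bytes left over."""
    while pending:
        try:
            n = sock.send(pending)
        except BlockingIOError:
            break            # kernel buffer full: retry on the next writable event
        del pending[:n]      # drop the bytes the kernel accepted
    return len(pending)

a, b = socket.socketpair()
a.setblocking(False)
buf = bytearray(b"OPEN message bytes")
left = buffered_send(a, buf)
print(left)  # 0: a tiny payload fits the kernel buffer in one call
```

In a real daemon the leftover buffer is re-attempted when the event loop reports the socket writable again.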
bgpd has more than 52 message types, more than OpenSSH. bgpctl talks to bgpd using the same imsg socket.
tcp md5: some very old code in the kernel for tcp md5, from 4.4BSD, never worked. tcp md5 is somewhat similar to IPsec AH, so implement it within the IPsec maze. Had to add a pfkey interface to bgpd; a committee-designed API. That made IPsec that much easier; extended the API so they can request unused SPIs from the kernel, so they don't have to be configured manually.
tcp md5/ipsec: when you don't have tcp md5 or ipsec in place, big tcp windows are risky. Stay at a 16k window unless you have tcp md5 or ipsec, then you get 64k--so ipsec improves performance. Joel Jaeggli asks how big a tcp window you need for a BGP session at all: the initial convergence gets faster with 64k, but thereafter, similar.
looking glass: just added an optional second control socket that is restricted to the "show" operations; the regular bgpctl binary can be used with it. cgi--yeah, that needs to be hacked into shape, but it's easy. Juniper only does static IPsec setup, so it requires nasty setup; OpenBGPD is dynamic, but interoperates with Junipers.
So back to the looking glass: security on OpenBSD. The httpd (an Apache 1.3 variant) runs in a chroot jail by default; the readonly socket can be placed inside that jail:
 bgpd_flags="-r /var/www/bgpd.rsock" in rc.conf.local
 put a statically linked bgpctl binary in the chroot
 /path/to/bgpctl -s /bgp.rsock, $
impressions from the road to ipv6: most heinous checkin message yet. The lower 2 bytes of the scope ID overwrite part of the v6 address...ugly!
Performance: http://hasso.linux.ee/linux/openbgpd.php -- it's quick. OpenBGPD 3.6 port for Linux; can't communicate with the kernel, no v6, no md5; 8 times faster than quagga.
future plans and ideas: the biggest task waits outside bgpd itself--the kernel routing table.
We need to make use of the radix mpath capabilities added in 2004, and add route source markers (BGP, OSPF, etc). bgpd and ospfd can blindly install their routes; the kernel then knows precedence. Hard to do, but once it's done, routing will be easier.
Also need multiple routing tables, with pf acting as table selector, so the unholy route-to can die and associated issues vanish; might be useful with bgpd as well.
Ideas for quite radical changes to speed up packet forwarding dramatically: will have a fast path where all easy cases can be handled on specialized PCI cards; multiple 10GE at wire speed within 2 years. The hardware exists, and is on its way to him.
For route servers, reversing filter and best path selection would be good. Filter generation from the RIPE DB or similar--but the IRR toolset sucks hairy moose balls; should be solvable in perl; "someone" has to code it. (Maybe use IRR power tools for it instead!)
We can fail over IP addresses already, thanks to CARP, and we can have synchronized state tables on multiple machines, which gives HA firewall clusters. Would be really cool to be able to fail over TCP sessions and BGP sessions; could make for BGP hitless failover. Syncing BGP stuff shouldn't be too hard.
Lots of work, not much time. Money has to come from somewhere, obviously. Unfortunately, people forget about this, just go to mirrors. Vendors don't help. Never got anything for OpenSSH yet it com
NANOG36-NOTES 2006.02.14 talk 7 Randy IRR routing security revisited
Many apologies...I'm no Stan Barber, but still doing my best to keep up with the note-taking. ^_^;; Matt
Slides are on Randy's site at http://rip.psg.com/~randy/060214.nanog-pki.pdf
What I want for Eid ul-Fitr
Randy Bush
randy at psg.com
Definition of Eid ul-Fitr: end of Ramadan; breaking of the fasting period, and of all evil habits. Roughly October 24th this year.
10 years ago Randy pleaded for people to use the IRR; he gives up--it didn't work, it has bad data, it doesn't work. Let's get rid of it. Routing security is what we need.
Routing security gap: assume the router has been captured. Routing security (not router security) is a major problem.
http://rip.psg.com/~randy/060119.janog-routesec.pdf
Need PKI: storing and passing and signing certificates.
Public Key Infrastructure
 PKI Database
 RIR Certs
 ISP Certs
 End Site Certs
 IP Address Attestations
 ASN Attestations
 IP and AS Attestations
Specifies identity == public key of recipient, signed by the allocator's private key. Follows the allocation hierarchy:
 IANA (or whomever) to RIR
 RIR to ISP
 ISP to downstream ISP or end-user enterprise
IP allocation example:
 IANA to RIR: S.iana(192/8, rir)
 RIR allocates to ISP: S.rir(192.168/16)
 and so on down the chain. Each level uses its private key to sign the certs it hands down the chain.
ISP/end-site certs may be acquired anywhere. They don't have to be chained to a single master organization, and you can use the same one for multiple RIRs, orgs, etc. RIRs can issue them as a service for members who don't get them anywhere else. They need no attestation because they are only used in business transactions where they are exchanged and managed by contract, or bound to IP or ASN attestations by the RIRs or upstream ISPs.
Big ISPs may use an ARIN identity for an APNIC allocation or business transaction. Since the keys are acquired separately, it doesn't matter where the certs come from, or where they're used.
RIR identity is similar: it's their public key. They can get it from 'above' (RIR, NRO, IANA), or they can even self-cert.
No provision for revocation, however.
PKI Interfaces/Users: nice slide showing the interrelationships; go see the slides for it, I won't try to render it in ASCII in realtime.
The certificates are directly exchanged as part of the business transaction when goods (IPs, ASNs, etc) are exchanged. The goal is to have formally verifiable route attestations, so you want replicas of the data near routers, to be used to determine validity of route origination and propagation.
Transacting with PKI: RFC 2585 describes FTP and HTTP transport for PKI; no need for transport security!
Tools for RIRs:
 generate and receive ISP certs
 receive ASN and IP space attestations from upstairs
Tools for ISPs:
 generate/get certs
 register role certs
 generate certs for downstreams
 sign allocations to downstreams
Open issues:
 Coordination of updates: one central repository is not feasible; LDAPv3 (RFC 3377) and RFC 2829 for authentication.
 Cert/key rollover and revocation not covered; may require a separate and secured communication channel.
NSF via award ANI-0221435, Steve Bellovin & JI.
From the microphone: are there TTLs on certs? Yes, which is why ISP certs are separated out. Addresses from ARIN are only "yours" as long as you keep paying ARIN; tie certs to contract terms. But the ISP identity cert is yours; nobody else should have control over its rollover and expiration. APNIC is working to have web pages.
Andrew Dole, Boeing: how does this get funded--Randy will take cash donations. Andrew thinks it'll take 10 million to get the ball rolling; Randy doesn't think that's the problem. The operator community would prefer to see a rigorously correct and verifiable solution with reasonable security infrastructure rather than one more hack on the IRR.
Second question: what is the forum to discuss and nail down the details? He'll be at APNIC in 2 weeks; for this region the ARIN meeting in Montreal, and this meeting is good too. Nobody seems to be sure where the right place to do this is.
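The allocation chain described above (each level signs the allocation it hands down, so a relying party can walk the chain back to the root) can be sketched with a toy example. HMAC stands in here for real public-key signatures, and the keys and attestation strings are invented for illustration:

```python
# Toy model of the signed allocation hierarchy: IANA attests a /8 to an RIR,
# the RIR attests a /16 to an ISP; verifying both links validates the chain.
import hashlib
import hmac

def sign(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify(key: bytes, msg: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(key, msg), sig)

iana_key, rir_key = b"iana-secret", b"rir-secret"   # hypothetical keys

# S.iana(192/8, rir): IANA's attestation that the /8 belongs to the RIR
a1 = (b"192.0.0.0/8 -> RIR", sign(iana_key, b"192.0.0.0/8 -> RIR"))
# S.rir(192.168/16): the RIR's attestation of the /16 to the ISP
a2 = (b"192.168.0.0/16 -> ISP", sign(rir_key, b"192.168.0.0/16 -> ISP"))

chain_ok = verify(iana_key, *a1) and verify(rir_key, *a2)
print(chain_ok)  # True
```

A forged attestation fails verification at whichever link it was injected, which is the property that lets a router judge route-origination validity.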
But Randy thinks the important part is to SEND the message, that there is a valid path.
Vince Fuller: soliciting input from this group is a good thing, but be more targeted. Figure out why the previous efforts failed, and target them.
Chris Morrow, Ted Seely...Randy targets some specific people in the audience. Chris Morrow notes that one challenge he faces is being able to verify whether filters are correct. Randy notes the ROUTER will verify the validity itself. Chris feels doing it in an OSS system is safer.
RS--how do you deal with crufty stuff? The RIRs and community will have to deal with that; he's just talking about providing tools to make it possible.
Sandy Murphy, Sparta--Randy, you've said no prefix lists are needed for this; but this could be used for building filter lists, or checking updates, or for tracking customers who call in with issues, etc. This is a first step for a whole BUNCH of things. So no matter what else we want to build on top of it, this really is the first level of the fou
NANOG36-NOTES 2006.02.14 talk 4 Flooding via routing loops
2006.02.14 talk 4 Flooding attacks
Jianhong Xia
A new talk added right before lunch by Randy Bush will push us to 12:25. Two talks coming up about DoS attacks against control information.
Flooding attacks by exploiting persistent forwarding loops.
Introduction: routing determines the forwarding path. Transient forwarding loops happen all the time during convergence; that's normal. But this talk focuses on persistent forwarding loops.
Why would persistent loops exist? Example of neglecting pull-up routes:
 A router announces 18.0/16 to the internet;
 router A has a default pointing to B;
 router A uses 18.0.0/24 only.
 Any traffic to 18.0.1.0-18.0.255.255 will enter the forwarding loop between A and B.
Risk of persistent forwarding loops: attacks can amplify based on the TTL of packets injected into the looping pair of routers. Can create a denial of service by flooding the upstream links between routers in front of the host they want to knock off; any other hosts behind that link are "imperiled addresses".
Measurement design: balancing granularity and overhead; samples 2 addresses in each /24 IP block.
Address space collection: addresses covered by the RouteViews table; de-aggregate prefixes into /24 prefixes--fine-grained prefixes.
Data traces: traceroute to 5.5 million fine-grained prefixes; measurement lasted 3 weeks in September 2005.
Almost 2.5% of routable addresses have persistent forwarding loops; almost 0.8% of routable addresses are imperiled addresses.
Validating these persistent forwarding loops from multiple places--from Asia, Europe, west and east coast of the US: 90% of shadowed prefixes consistently have persistent forwarding loops.
Validation to multiple addresses in shadowed prefixes: sampling 50 addresses in each shadowed prefix; 68% of shadowed prefixes show that...
Properties of the loops: how long are the loops?
86.6% of loops are 2 hops long; 0.4% are more than 10 hops long; some are more than 15 hops.
Location: 82.2% of persistent loops happen within the destination domain.
Implications: significantly amplifies attacking traffic; can be exploited from different places.
(Oops. Matt gets paged out to deal with an issue, so no more notes for a while.)
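The amplification the talk describes falls out of the TTL arithmetic: a packet injected into a persistent loop keeps crossing the same links until its TTL expires. A minimal sketch of that back-of-the-envelope calculation (the numbers are just the standard TTL values, not figures from the talk):

```python
# One attack packet injected into a persistent loop of loop_len routers
# crosses each link in the loop roughly initial_ttl / loop_len times
# before the TTL hits zero, so the upstream link carries it many times over.

def amplification(initial_ttl: int, loop_len: int = 2) -> int:
    """Approximate number of times each looping link carries the packet."""
    return initial_ttl // loop_len

print(amplification(255))  # 127: each link in a 2-router loop carries it ~127 times
print(amplification(64))   # 32: even a default Linux-style TTL amplifies 32x
```

This is why flooding a looped prefix can saturate the upstream link far beyond the attacker's own sending rate, imperiling every host behind it.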
NANOG36-NOTES 2006.02.14 talk 3 Flamingo Netflow Visualization Tool
2006.02.14 talk 3 Flamingo netflow visualization
Manish (from the BGP Inspect project at Merit)
bgpinspect.merit.edu:8080
He'll be talking later at the Tools BOF as well, apparently.
Introduction: what is Flamingo?
 Visualization
 The Flamingo tool: combining visualizations with controls
Case studies:
 traffic anomaly
 network scans
 worm traffic
 P2P traffic
 the slashdot effect
The tool has been under development for a year; John, in the audience, and Mike (now employed) have been working on it as undergrads. It's just a view into netflow, no filters or adjustment of data; it's purely a visualization system. Client/server architecture; a single server can support multiple clients.
Visualization methods: 5 different views:
 extended quad tree implementation
 volume by src/dst IP prefix
 volume by src/dst AS
Basic quad tree: represent a 32-bit IP address in a fixed 2D space. 4 areas are representable by 2 bits; keep splitting 16 times and you represent a 32-bit address in a 2D mapping. Convert it into 3 dimensions and you have an axis of freedom to represent additional info: one face is the quad tree, the Z axis is volume of traffic, so you can see relative volumes.
Nice slide showing visualization of the traffic flow patterns. Can show traffic flows aggregated by src/dst IP; now there are 2 surfaces needed on the cube, so they use line thickness between the surfaces to show flow sizes between ASes.
The last visualization incorporates port info as well. Since there's only one axis left, port-level info goes on the Z axis: IP/port is X1Y1Z1, and the same for dest IP and port. Once there are coordinates, the line can be drawn; scale the width based on the volume, and now you have the full info in one view. The same colour is used to represent traffic from the same source n-tuple.
Combine 2D and 3D representations of the data to help keep yourself oriented. They have text representations of the information--the same as the visual data, but in text form.
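The quad-tree mapping described above (two address bits per level pick one of four quadrants, 16 levels deep) is equivalent to de-interleaving the 32 address bits into a pair of 16-bit (x, y) coordinates. A sketch of that encoding, independent of Flamingo's actual implementation:

```python
# Map a 32-bit IPv4 address to quad-tree (x, y) coordinates in a
# 65536 x 65536 grid: at each of 16 levels, the high bit of the 2-bit
# quadrant selector goes to x and the low bit to y.
import ipaddress

def quadtree_xy(ip: str) -> tuple[int, int]:
    addr = int(ipaddress.IPv4Address(ip))
    x = y = 0
    for i in range(16):                      # 16 levels, 2 bits per level
        two = (addr >> (30 - 2 * i)) & 0b11  # next quadrant selector
        x = (x << 1) | (two >> 1)            # high bit -> x coordinate
        y = (y << 1) | (two & 1)             # low bit  -> y coordinate
    return x, y

print(quadtree_xy("0.0.0.0"))          # (0, 0)
print(quadtree_xy("255.255.255.255"))  # (65535, 65535)
```

The useful property for visualization is that nearby prefixes land in nearby cells, so a /24 under attack shows up as a tight cluster on the quad-tree face.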
Slider bars allow thresholding of what gets displayed, to prevent clutter: only flows over a certain size, or only certain ports, etc. Can also apply labels to help pull information out for fast reference. You can also restrict the address space you care about, to only look at certain subnets.
Case study: traffic anomaly. Sunday Oct 16, 2005: a large burst of traffic from one host at umich, lasted 6 hours (12pm to 6pm), four targets, not widely distributed; it was UDP traffic. Was visible in the normal view: visible on the main view, zoomed in, and the 4 million flows show as a huge block. Going to the src/dst view lets you see where the traffic is going; adding the port info, you see the entire port space being sprayed.
Another case study: worm traffic doing port 42 scans--a fan view on the graph, highly visible.
An artificial case study: a host scanning a /24 subnet. SSH scans also show up, as many many ports probing a single port: a reverse fan.
Slashdot effect on campus, Oct 31 2004: before and during pictures showing the huge traffic swing.
Zotob worm infection: random destination IPs, but the same port, coming from the same host--a cone effect.
P2P traffic: a single host with multiple connections to different destinations, significant volume to each.
Darkspace traffic visualizations show nothing but scans; they show up really dramatically.
Conclusion: the Flamingo visualization tool provides users with the ability to easily explore and extract meaningful information regarding traffic flows in their network. More will be discussed at the Tools BOF this afternoon.
http://flamingo.merit.edu/
Break now, come back at 10:50. Someone left a jacket at the Yahoo party with a digital camera; describe it to the registration desk to get it back.
NANOG36-NOTES 2006.02.14 talk 2 Netflow Visualization Tools
2006.02.14 talk 2 Netflow tools
Bill Yurcik
byurcik at ncsa.uiuc.edu
NVisionIP and VisFlowConnect-IP
There are probably a dozen tools out there; these are just two of them. Consensus is there's something to this. They're an edge network; traffic comes into the ISP domain, and their tools are used by entities with many subnet blocks.
Overview:
 Project motivation
 Netflows for security
 Two visualization tools: NVisionIP, VisFlowConnect-IP
 Summary
Internet security: an N-dimensional work space--
 large: already lots of data to process;
 complex: combinatorics explode quickly;
 time dynamics: things can change quickly!
Visualizations can help! In near-realtime: overview-browse-details on demand. People are wired to do near-realtime processing of visual information, so that's a good way to present information for humans. HCI says use the overview-browse-details paradigm.
Netflows for security can identify connection-oriented stats to see things like attacks, DoS, DDoS, etc. Most people don't use the data portion of the flow field (the first 64 bytes); they just look at header info or aggregated flow records.
Can spot how many users are on your system at a given time, to schedule upgrades. Who are your top talkers? How long do my users surf? What are people using the network for? Where do users go? Where did they come from? Are users following the security policy? What are the top N destination ports? Is there traffic to vulnerable hosts? Can you identify and block scanners/bad guys?
This doesn't replace other systems like syslog, etc.; it integrates and works alongside them.
Architecture slide for NCSA. Can't really do sampled views for security, so you probably need a distributed flow collector farm to get all the raw data safely.
Two visualization tools: NVisionIP, VisFlowConnect-IP; focus on a quick overview of the tools. security.ncsa.uiuc.edu/
NVisionIP is a 3-level hierarchical tool: galaxy view, (small multiple view), ((machine view)). Galaxy is an overview of the whole network; the color and shape of each dot represents a host in the network.
Settable parameters for each dot. An animated toolbar and clock show changes over time in the galaxy. Lets you get high-level content quickly and easily.
The domain view lets you drill in a bit more; the small multiple view looks at the traffic within the block. The upper histogram is the lower, well-known ports; the lower histogram is ports over 1024. You can click on a given multiple-view entry to delve into one machine. Many graphs for each machine in the most detailed view: well-known ports first, then the rest of the ports (sorted), then source and destination traffic broken out. Designed for class Bs.
http://security.ncsa.uiuc.edu/distribution/VisFlowConnectDownload.html
VisFlowConnect-IP: 3 vertical lines; comes from the edge-network perspective. The middle line is the edge network being managed; you set the range of networks you care about. The outside lines are people sourcing or sinking traffic to you, from outside domains. There's a time axis; traffic is only shown for the slice of time currently under consideration. Uses VCR-like controls to move time forward/backward. Lets you see traffic/interactivity, drill into a domain, see host-level connectivity flows.
Shows MS Blaster virus traffic as an example. Example 2, a scan example. Just because it looks like one IP hitting many others doesn't mean it's really a security incident, though; it could be a cluster getting traffic. Web crawlers hitting NCSA web servers make a very characteristic pattern over time.
Summary: netflow analysis is non-trivial; NVisionIP, VisFlowConnect-IP. Lots of references listed in very fine blue font.
http://security.ncsa.uiuc.edu/distribution/NVisionIPDownload
Avi Freedman, Akamai: Argus was mentioned a lot; it lets you grab symmetric netflows, but also does TCP analysis and shows some performance data as well. Not sure if people are studying the impact of correlating Argus data with flow data.
Roland Douta? of Cisco: many people are using netflow to track security issues. They now have ingress and egress flow data on many of their platforms.
In reading the paper describing it, there's a data conversion that needs to happen into an internal format that NVision can understand. It reads log files at the moment; takes about 5 minutes to process files. Lets them take different file data sources, making the visualization tool independent of the input format. They can read large files, but there is a performance hit when doing so.
Are they planning to do further work on the tool to collect TCP flags, frags, dropped traffic, etc? They've looked at it, but they leave flag activity to IDS tools. Might be of interest to consider for future versions of the tools.
A last question came up, echoed, about Argus. Question about interactivity: they are working on feedback through the tools. Question about alarming on patterns: once you start alarming or putting up visual indicators, it distracts from the rest of the overall pattern, and you tend to miss other information.
NANOG36-NOTES 2006.02.14 talk 1 IRR power tools
Apologies in advance, notes from this morning will be a bit more scattered, as I was working on an issue in parallel to taking notes. Matt
2006.02.14 talk 1 IRR Power Tools
12:10 to 12:25, extra talk added, not on the printed agenda. Thanks to those who submitted lightning talks. PC committee members are doing moderation; Todd Underwood will be handling the first session this morning. There will be 3 talks about tools for operators: 1 IRR and 2 netflow tools. Be thinking of interesting questions to ask. Todd has to introduce RAS at 9am--7am west coast time, which is normally his bedtime.
IRR Power Tools: Dec 2004 first generation, re-write since.
IRR--a quick review. People have been asking him "why do we need the IRR?" Any time you have a protocol like BGP that can propagate information, you need some form of filtering in place to limit damage. IRRs are databases for storing lists of customer information. Written to speak RPSL; some speak RPSLng.
 RADB
 ALTDB
 VERIO, LEVEL3, SAVVIS
 RIR-run databases: ARIN, RIPE, APNIC, etc.
IRRs are better than manual filtering--huge list on the slides. Filtering is needed, and hard to keep updated by hand.
Why doesn't everyone use the IRR? Many people do. In Europe, pretty much total support: it's required by RIPE, providers won't deal with you if you don't keep your entries up, and large exchanges likewise check. A few major networks in the US use the IRR too: NTT/Verio, Level3, Savvis. Most people don't.
Why doesn't everyone use it? In the US, it's too complex for customers; support costs go up if you have to teach customers.
Networks don't like to list their customers in a public database that can be mined by competitors.
RAS figured he could fix at least one piece. Wrote a tool to help with:
 automatic retrieval of prefixes behind an IRR object
 automatic filtering of bogon or other undesirable routes
 automatic aggregation of prefixes to reduce config size
 tracking and long-term recording of prefix changes
 emailing the customer and ISP with prefix changes
 exporting the change data to plain-text format for easy interaction with non-IRR-enabled networks
 generating router configs for easy deployment
Doesn't do import/export policies; doesn't do filter-sets, rtr-sets, peering-sets, etc. Just focuses on the essential portions.
The tool was written around IRRToolSet initially, but the C++ code didn't compile nicely. This isn't a complete replacement for IRRToolSet, but it provides the basic functionality.
A few conf files: IRRDB.CONF, EXCLUSIONS.CONF, NAG.CONF.
./irrpt_fetch grabs the current database info. It also speaks clear English on add/remove of prefixes for access lists; the default format is English, but you can change it to diff format.
./irrpt_pfxgen ASNUM generates a prefix list suitable for the customer interface. Can use -f juniper to create Juniper filters.
http://irrpt.sourceforge.net/
Always looking for more feedback; it's been deployed by a few people in the peering community; this is its first widescale announcement.
Future plans:
 Add support for IPv6/RPSLng; needs IPv6 aggregation tools.
 The RADB tool uses a faster protocol; RIPE just breaks down one level, so you have to do multiple iterations to get the full expansion. Servers tend to time out before you can get all the answers; RIPE servers have a hard 3-minute timeout that closes the socket.
 Add SQL database support for a backend.
 Convert from a script to a real application.
IRRWeb -- http://www.irrweb.com/
He'll talk about irrweb at the next NANOG.
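The "automatic aggregation of prefixes to reduce config size" feature mentioned above can be illustrated with Python's stdlib rather than irrpt itself (the prefixes below are just documentation ranges chosen for the example):

```python
# Prefix aggregation: adjacent prefixes merge into their covering aggregate,
# and prefixes already covered by a shorter one are dropped, so the router
# prefix-list generated from IRR data gets much shorter.
import ipaddress

prefixes = [
    "192.0.2.0/25", "192.0.2.128/25",   # adjacent halves of a /24
    "198.51.100.0/24",
    "198.51.100.64/26",                 # covered by the /24 above
]
nets = [ipaddress.ip_network(p) for p in prefixes]
collapsed = [str(n) for n in ipaddress.collapse_addresses(nets)]
print(collapsed)  # ['192.0.2.0/24', '198.51.100.0/24']
```

On a real customer AS-set with hundreds of registered routes, this kind of collapse is what keeps the generated access-list within router config limits.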
Allow end users to register routes without needing to know ANYTHING about RPSL. You can play with it, register routes, but it doesn't publish anywhere.
That's it--happy valentine's day!
Richard A Steenbergen
ras at nlayer.net
Susan notes that RADB is developed by Merit, and the two primary developers are here today: Chris Fraiser, main customer interface now, and Larry Blunk, the RPSLng person.
Right now there's no mirroring between IRRs when a new IRR comes up; you have to mesh with everyone else. RADB at least does pick up from the others, so right now RADB is the best spot to do your queries against.
Todd asks about filters: does it do prefix list only, or prefix list plus as-path? It builds off ASes behind other ASes, which might not be the best model; the latest code is starting to understand as-sets. To do it properly, you might need import/export policy support.
Randy Bush, IIJ: like IPv6, this meeting marks the tenth anniversary of Randy pushing for IRR adoption. And like IPv6, the adoption rate has not been going well. What's wrong? Pretty much too complex, which is why this effort is to make it much simpler, to try to get more uptake in the US.
Todd notes 2 things: 1, the tools are too difficult--this addresses that. The second piece is that in the US, allocations aren't tied to registry entry creation; this won't solve that part at all. For the second part, the benefits are seen mostly the closer you are to the registration process. Anyone can register any block; and if you don't use
NANOG36-NOTES 2006.02.13 talk 7 QoS in MPLS environments
Here's my notes from the MPLS QoS tutorial; wish I could have been in two places at once to catch the ISPSec BOF as well. I won't be taking notes at Eddie Deen's, though, so it'll be up to Ren's camera to capture the details for those following along at home. < http://nanog.multiply.com/ > Matt
2006.02.13 QoS in MPLS networks tutorial notes.
See notes for agenda, outline, etc. at http://www.nanog.org/mtg-0602/sathiamurthi.html
Traffic characterizations go beyond simple DiffServ bit distinctions: understand traffic types, sources, and the nature of the traffic before applying QoS.
Latency, jitter, and loss are the three traffic parameters to be tracked that influence choices made when applying QoS.
It's all about managing finite resources:
 rate control, queuing, scheduling, etc.
 congestion management, admission control
 routing control
 traffic protection
The QoS triangle (no, not the Bermuda triangle):
 identify traffic type
 determine QoS parameters
 apply QoS settings
2 approaches to QoS: a fine-grained approach, or combining flows of the same traffic type from the same source. Flows need the same characteristics so you can treat them as an aggregated flow.
Best Effort is the simplest QoS; Integrated Services (hard QoS); Differentiated Services (soft QoS).
Best Effort is simple: the traditional internet.
Integrated Services model, RFC 1633: guarantees per-flow QoS, strict bandwidth reservations. RSVP, RFC 2205, PATH/RESV messages. Admission controls must be configured on every router along the path. Works well on a small scale; scaling challenge with large numbers of flows. What about aggregating flows into integrated services?
DiffServ architecture, RFC 2475: scales well with large numbers of flows through aggregation; creates a means for traffic conditioning (TC); defines per-hop behaviour (PHB). Edge nodes perform TC, which keeps the core doing forwarding. Tough to predict end-to-end behaviour, especially with multiple domains--how do you handle capacity planning?
DiffServ architecture slide with pictures of traffic flow.
The TCA prepares the core for the traffic flow that will be coming in; allows the core to do per-hop behaviour.
IETF DiffServ model: redefine the ToS byte in the IP header as the differentiated services code point (DSCP); uses 6 bits to group behaviour into behaviour aggregates. Class Selector (CS0 through CS7).
Classifier: selects packets based on headers.
Classification and marking: flows have 5 parameters: IP src, dest, precedence, DSCP bits, ... You can handle traffic metering via the three token-bucket parameters.
3 parameters are used by the token bucket: committed information rate, conformed burst size, and extended burst size.
Policing vs shaping: policing drops excess traffic; it accommodates bursts, and anything beyond that gets dropped--or can be re-marked. Shaping smooths traffic but increases latency; it buffers packets.
Policing uses the token bucket scheme: tokens are added to the bucket at the committed rate; the depth of the bucket determines the burst size. Packets arriving when there are enough tokens in the bucket are conforming; packets arriving when the bucket is out of tokens are non-conforming--either coloured, dropped, etc. Diagram of the token bucket, very nice.
Shaping uses the token bucket scheme as well: smooths through buffering; queued packets are transmitted as tokens become available.
Aspect 1 is traffic conditioning at the edge; aspect 2 is per-hop behaviour. PHB relates to resource allocation for a flow; the resource allocated is typically bandwidth.
Queuing/scheduling mechanisms: FIFO / WFQ / MWRR (weighted) / MDRR (deficit).
Congestion avoidance: RED (random early detection) / WRED (weighted random early drop).
Queuing/scheduling needs some data mining to decide how to prioritize certain classes of traffic; de-queuing depends on the weights assigned to different flows.
Congestion avoidance technique: when there is congestion, what should happen?
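Before moving on to congestion avoidance, the token-bucket policer described above can be sketched in a few lines. This is a single-bucket simplification (no extended burst / exceed colour), with illustrative rate and depth values:

```python
# Single token bucket policer: tokens accrue at the committed rate, bucket
# depth bounds the burst, and a packet conforms only if enough tokens exist.

class TokenBucket:
    def __init__(self, rate_bps: float, depth_bytes: float):
        self.rate = rate_bps / 8.0      # token refill rate in bytes/second
        self.depth = depth_bytes        # bucket depth == max burst in bytes
        self.tokens = depth_bytes       # start with a full bucket
        self.last = 0.0

    def conforms(self, now: float, pkt_bytes: int) -> bool:
        # refill for elapsed time, capped at the bucket depth
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return True
        return False                    # non-conforming: drop or re-mark

tb = TokenBucket(rate_bps=8000, depth_bytes=1500)   # 1 kB/s, one-MTU burst
print(tb.conforms(0.0, 1500))  # True: the burst fits the full bucket
print(tb.conforms(0.1, 1500))  # False: only ~100 bytes refilled so far
print(tb.conforms(1.6, 1500))  # True: 1.5 s of refill restores the bucket
```

A shaper uses the same arithmetic but queues the non-conforming packet until tokens accrue, trading the drop for added latency.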
Options: tail drop (hit max queue length), or drop selectively based on IP Prec/DSCP bits.
Congestion control for TCP: adaptive, the dominant transport protocol.
Slide showing the problem of congestion: without a technique, you have uncontrolled congestion, with a big performance impact due to retransmissions.
TCP traffic and congestion: congestion avoidance vs slow-start; sender/receiver negotiate, and the source throttles back traffic. (Congestion control leverages this behaviour.)
Global synchronization happens when many flows pass through a congested link: each flow going through starts following the same backoff and ramp-up, which leads to sawtooth curves.
RED is a congestion avoidance mechanism that works with TCP: uses packet drop probability and average queue size; avoids global synchronization of many flows; minimizes packet delay jitter by managing queue size. RED has minimum and maximum thresholds; average queue size is used to avoid reacting to transient bursts.
WRED combines RED with IP precedence or DSCP to implement multiple service classes; each service class has its own min and max threshold and drop rate. Nice slides of lower and higher thresholds for different traffic types.
When is WRED used? Only when TCP is the bulk of the traffi
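The RED drop decision described above reduces to a simple ramp on the average queue size: no drops below the minimum threshold, everything dropped above the maximum, and a linear ramp toward a maximum probability in between. A sketch (threshold and max_p values are illustrative; WRED would keep one such curve per precedence/DSCP class):

```python
# RED drop probability as a function of average queue length:
# 0 below min_th, 1 at/above max_th, linear ramp up to max_p in between.

def red_drop_prob(avg_qlen: float, min_th: float, max_th: float,
                  max_p: float = 0.1) -> float:
    if avg_qlen < min_th:
        return 0.0                       # queue healthy: no early drops
    if avg_qlen >= max_th:
        return 1.0                       # tail-drop region: drop everything
    return max_p * (avg_qlen - min_th) / (max_th - min_th)

print(red_drop_prob(10, min_th=20, max_th=80))  # 0.0
print(red_drop_prob(50, min_th=20, max_th=80))  # 0.05 (halfway up the ramp)
print(red_drop_prob(90, min_th=20, max_th=80))  # 1.0
```

Because different flows see probabilistic drops at different times, their TCP backoffs desynchronize, which is exactly how RED avoids the global-synchronization sawtooth.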
NANOG36-NOTES 2006.02.12 talk 5 IPv6 --fear and GOSIP in Dallas
Apparently the video feed is of very good quality this time around--many thanks to Brokaw for the good bandwidth to the hotel! Last set of notes before lunch. Matt

2006.02.12 NANOG IPv6 transition panel
Panel member briefs at http://www.nanog.org/mtg-0602/golding.html
IPv6: time for transition, or just more GOSIP? GOSIP was the initiative to use OSI networking throughout the government.
5 participants: Joe Houle, AT&T; Jared Mauch, NTT America; Wes George, Sprint; Jason Schiller, UUNet/Verizon; Fred Wettling, Bechtel. Tried to get government people, since they went v6, but they're not forthcoming with details; you know how government people are. :D Host: Daniel Golding, The Burton Group.
Joe Houle, AT&T, is up first. Emerging services for AT&T for L2/L3, IP private networking, v6, etc. fall under his bailiwick. He'd count himself as pragmatically pro. IPv6, why now? He does believe we're running out of IPv4 addresses. NATs and non-unique addresses make offering quality services difficult; convergence doesn't work well over NAT'd addresses. Why governments? The US government doesn't want the have/have-not split to continue; the v6'ers may be the "have" side and we don't want to be on the have-not side.
Jared Mauch, NTT America (AKA 2914): native dual-stack IPv4/IPv6 since fall 2003; Cisco 7200, 7500, "76k"; Juniper M-series, T-series.
Wes George, wearing Rob Rockell's hat (Rob couldn't be here due to weather). Pro v6, but looking at it with skepticism. Sprint is close to the center of the v6 world; 200pps on the v6 network. The Internet doesn't use v6 for real yet. This is not the same movie as the ISO fun: this time the government is paying! IPv6 is something that US carriers can make money on in the VPN space; it is not valuable as an internet transport yet. Spend less time marketing about how cool it is, and go fix the issues!! Multihoming, micromobility; shim6 is a host solution. This time around, the government is paying. They don't know exactly what they want, but they know they want it; hoping carriers will figure it out and tell them.
Jason Schiller, UUNet/Verizon: public v6 roadmap. AS284 US / AS12702 EMEA / AS18061 AsiaPac for v6 only, over the network utilizing GRE. Phase 2: 6PE solution in AS701, dual stack v4/v6 on the edge; mail, DNS support later. Phase 2a: upgrade existing non-6PE-capable edge routers. 2007, phase 2b: native v6 in the core (maybe). Problem is, there's no money yet in v6, so they can't roll out aggressively at all. But if no money, why put it in the core? Well, to be ready in case it DOES take off in the future.
Fred Wettling, Bechtel--large enterprise, also with the v6 business council. Bechtel Telecoms (A & C for big carriers like Sprint, AT&T, etc). Interested in non-traditional transport of IP services; there's a shift in plant automation networks from proprietary to IP, so they want to be ahead of the curve on it. Bechtel's internal test started last year, will be deployed out to 40,000 by this year; a bit of a chicken-and-egg issue. Go back to 1995, IE v1 vs today: things will progress, things will take off, and the goal is to be ahead of the curve.
Daniel Golding, host for the panel. Question 1: Why IPv6, why now? Why are you implementing v6, other than it's cool? Is it address exhaustion, new capability, gov't RFP requirements, vendors pushing new hardware?
Jared notes they rolled it out in 2003 due to global pressures; they wanted to keep a unified network model worldwide, and being a subsidiary of a Japanese company, the largest player in that space, combined with government mandates, really pushed them into that space early. It _is_ a technically cool thing, and it's good to be a market leader. Jared notes that they've been running dual stack v4/v6; it just works.
AT&T: VoIP has been a driver; it just doesn't work over NAT, so what other solutions are there? Really, address exhaustion and non-unique addresses propagating throughout the space are just putting roadblock after roadblock in front of convergence.
Dan asks why we need NAT--we're not OUT of v4 addresses yet. Joe notes that people are really using NAT as a security mechanism right now, more so than really worrying about conserving address space. Yes, it's bogus, but it's what people have been sold on right now, so it gets widely used. Jared pitches in and notes that the push for encapsulating everything over port 80 is getting more and more widespread. People are attempting to use "firewalls" and "NATs" to give themselves the notion of security, even though most infection rates now are coming from other vectors (spyware, infected email, etc.), rather than outside probing. Dan notes we don't need to do NAT; they can go to their upstream, or to ARIN. But ARIN frowns on using public space for private use? Bechtel notes they're running into more and more problems as they try to get companies to do joint ventures, as every company uses 10.x space, and they have to do NAT over NAT; it's evil. He's also an IMOD (infrastructure modernization) player; it's a 4 billion dollar upgrade for the military, and it ha
NANOG36-NOTES 2006.02.13 talk4 DNS infrastructure distribution
2006.02.13 Steve Gibbard, DNS infrastructure distribution
Steve Gibbard, Packet Clearing House, http://www.pch.net/, scg at pch.net
Introduction: previous talk on the importance of keeping critical infrastructure local. Without local infrastructure, local communications are subject to far-away outages, costs, and performance. Critical infrastructure includes DNS: if a domain is critical, so is everything above it in the hierarchy.
Sri Lanka is a case in point. The previous talk was in Seattle last spring, and highlighted an undersea cable being cut; even local DNS queries failed since the TLD servers couldn't be reached, even though local connectivity still worked. The ship dragging anchor in the harbor cut the only undersea path out of the country; international calling was down, and all of the Internet. But unlike the local telephone system, even local networks failed to work.
Root server placement: currently 110 root servers(?); the number is a moving target. Operated by 12 organizations; 13 IP addresses; at most 13 servers visible from any one place at any one time. Six are anycast; four are anycasted in large numbers. All remaining unicast roots are in the Bay Area, LA, or Washington DC.
Distribution by continent: 34 in NA; 8 each in the Bay Area and DC; 5 in LA. The only non-coastal roots in the US are Chicago and Atlanta. Canada; Monterrey, Mexico; some others. 34 in Europe: clusters of 4 each in London and Amsterdam, Europe's biggest exchanges; spread fairly evenly throughout the rest of Europe. 26 in Asia (excluding the Middle East): 5 in Japan (4 Tokyo, 1 Kyoto); 3 each in India, Korea, Singapore; 2 each in Hong Kong, Jakarta, and Beijing; South Asia is an area of rapid expansion. 6 in Australia/New Zealand: 2 in Brisbane; 1 each in Auckland, Perth, Sydney, and Wellington. 5 in the Middle East: 1 each in Ankara, Tel Aviv, Doha, Dubai, Abu Dhabi. 3 in Africa: 2 in Johannesburg, 1 in Nairobi, 1 more being shipped; very little intercity or intercountry connectivity. 2 in SA: Sao Paulo, Santiago de Chile. Other parts of the world are not really served at all.
World map with blobs showing coverage; huge areas not covered. Overlaid fiber maps with the dots to get an idea of (redundant) coverage: everyone else is one fiber or satellite cut from being isolated and dark. Pretty much follows the areas with money.
Root server expansion: 4 of the 12 root server operators are actively installing new roots. 110 root servers is a big improvement over 13 from 3 years ago. Two operators (Autonomica and ISC, I and F) are installing wherever they can get funding; funding sources are typically RIRs, local governments, or ISP associations. Limitations in currently unserved areas are generally due to lack of money.
Fs and Is: in large portions of the world, the several closest roots are Is and Fs; at most 2 root IP addresses are visible, the others far away. Does this matter? It gives poorly connected regions less ability to use BIND's failure and closest-server detection mechanisms, and non-BIND implementations may default to far-away roots. Should all 13 roots be anycasted evenly? The CAIDA study from 2003 assumed a maximum of 13 locations; not really relevant anymore.
Big clusters: lots of complaints about uneven distribution, but it's only really a concern if resources are finite. Large numbers in some places don't prevent growth in others. The Bay Area and DC clusters seem a bit much, but sort of match topology. Western Europe's dense but relatively even distribution is exactly right. Two per city is perhaps a good goal for everywhere.
TLD distribution: like the root, locally used TLDs need to be served locally. Locally used TLDs: the local ccTLD, plus any other TLDs commonly in use. Regions don't need ALL TLDs.
gTLD distribution, .com/.net: well connected to the "internet core"; servers in the big cities of the US, Europe, Asia; non-core location: Sydney. A map of the world with .com/.net overlaid on fiber maps shows "well-served areas" again following the money, with even less coverage outside NA/Europe/Asia.
gTLD distribution, .org/.info/.coop: share the same servers; considered confidential.
Data may be incomplete; significantly fewer publicly visible servers, almost all in the internet core; only one public location in each of Asia and Europe. Even worse coverage worldwide, though they do have South Africa. They do have some caching boxes next to caching resolvers at the big ISPs; not sure if that increases coverage or not. A few other gTLDs they didn't map: .gov is US-centric; .edu is US, some EU, some Asia; .int is California, Netherlands, UK (not very international!!). Where should gTLDs be? Presumably it depends on their markets; if it's OK for large portions of the world to not use those gTLDs, then it's OK for them to not be hosted there.
ccTLD distribution: the answers to where ccTLDs should be are more straightforward; working in their own regions is a must, and working in the "core" could be a plus. Just over 2/3 of ccTLDs are hosted in their own countries (but a lot of those aren't ... A green map shows those countries that host their own ccTLDs locally. Most islands are red, in danger of bei
NANOG36-NOTES 2006.02.13 talk 3 NTT labs AAAA query explosion worries
(Huge apologies in advance for any and all names I completely mangle! Check http://nanog.multiply.com/ to see names/faces correctly handled by Ren. ^_^; ) Matt

2006.02.13, talk 3, NTT labs (Steve Feldman apologizes for mangling the pronunciation of their names). NTT information sharing platform labs (didn't get names/info from the opening slide).
Outline: they expect an increase in the number of DNS queries this year. Discussion of the effect on cache server load and user response time, and how can we decrease the number of unnecessary queries?
Today's topic: the increase in the number of queries between users and cache servers caused by IPv6 support: the number of AAAA queries is the same as that of A queries; domain name completion (DN completion by OS, DN completion by application); an IPv6-enabled OS increases AAAA queries; Vista will be v6-enabled by default.
IPv6 and OS resolvers: an IPv6-enabled OS sends AAAA queries for every name resolution. BSD/Windows send both A and AAAA queries for every name resolution; there is currently no way to disable one or the other.
Domain name completion: when a name resolution fails, both the OS and applications automatically try different prefix/suffix completions. OSes use these domains to complete: FreeBSD: specified by "search" in /etc/resolv.conf, distributed by DHCP. Windows: configured in the control panel, distributed by DHCP. Applications: Mozilla retries with a www prefix; IE searches for the domain using MSN search and then retries name resolutions by adding .com, .org, .net, .edu. Convenient for the user, perhaps, but hard on nameservers.
Combinations: completions differ depending on the OS. FreeBSD tries domain completions for A and AAAA for each case. Windows tries all AAAA records first, THEN tries all A records. So IPv6 queries in Windows mean that even if there's an A record in v4 space, it exhausts ALL AAAA possibilities FIRST, before going back to get the A record. Longhorn/Vista: IPv6 enabled by default; ALWAYS tries AAAA queries first!
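The Windows-style ordering just described multiplies queries quickly; a toy counter makes it concrete. This only enumerates the queries that would be sent, it doesn't do any DNS, and the search-list entries are hypothetical:

```python
def queries_windows_style(name, search_suffixes):
    """Windows-style resolution order as described in the talk:
    exhaust AAAA for the bare name and every search-list completion,
    then repeat the whole candidate list for A."""
    candidates = [name] + [name + "." + s for s in search_suffixes]
    aaaa = [(c, "AAAA") for c in candidates]
    a = [(c, "A") for c in candidates]
    return aaaa + a  # all AAAA first, then all A

# With a 3-entry search list, one lookup becomes 8 queries, and the
# 4 AAAA queries all go out (and fail) before the first A query,
# even when a plain A record for the bare name would have sufficed.
qs = queries_windows_style("intranet",
                           ["corp.example", "example.com", "example"])
```

Add the browser's own retries on top (www prefixes, .com/.org/.net/.edu suffixes) and the 12-to-40-queries-per-click figures from the talk stop looking surprising.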
IE7 plus Vista results in 12 DNS queries per user click, best case. Worst case, one user click results in 40 DNS queries!! Slide showing the projected impact based on historical data plus projected Vista deployment. Right now, AAAA queries are only about 5% of queries; after Vista, the size of the increase could dwarf the rest of DNS queries. The release of Windows Vista (IPv6 by default) at least doubles the number of user queries, and causes more queries via domain name completions and domain search sequences.
For operators: cache servers should be prepared for those increases; stop domain distribution to users by DHCP or PPPoE. For OS developers: is the current search order of resolvers appropriate? E.g., should the "A" record be resolved before domain completion?
Ed from Neustar, at the microphone: before we consider this a problem, consider it from the point of view of an application provider. When you need a name, you don't know what transport you may have underneath; if you wait for NXDomain, you increase latency, so app developers generally send all queries at once. What about changing DNS to allow asking multiple questions at once? Changing application behaviour isn't likely to happen, and changing protocols isn't easy; so why not just beef up the infrastructure to handle it?
Joel Jaeggli, U of Oregon: do you know how many of those queries will need to fail over from UDP to TCP due to responses being too large to fit into a single UDP response? Most of the responses coming back don't have data, so they don't need to go to TCP.
Tony Bates: what happens when a v6 record is returned as valid; does the chain stop there? Also, if you flip to return the A record first, we'll never move to v6. We NEED to start resolving v6 records first, to help move the 'Net off IPv4.
Applause, on to next talk.
NANOG36-NOTES 2006.02.13 talk 2 Duane Wessels, DNS cache poisoning
2006.02.13 talk 2: DNS cache poisoners--Lazy, Stupid, or Evil. Duane Wessels.
Motivation: during March/April 2005, the SANS Internet Storm Center reported that a number of DNS cache poisoning "attacks" were occurring. Poisoned nameservers have bogus NS records for the com zone. SANS ISC theorizes it may have been a vector for spyware propagation. Microsoft Windows (most versions) and Symantec firewall products are affected. Slides are on the website, BTW.
The poisoning attack: an authoritative nameserver (where queries normally go) is configured to return bogus, out-of-bailiwick NS records. A caching resolver receives and trusts those bogus referrals; future queries for names in the poisoned zone go to the bogus NS. dig +trace longislandauction.com will show the poisoned NS responses: NS auth1.ns.sargasso.net., which has NS com. auth1.ns.sargasso.net., so any caching resolver may consider auth1.ns.sargasso.net authoritative for any unknowns in the com zone.
Vulnerable implementations: Windows NT (by default; SP4 can tweak it via the registry); Windows 2K (by default, later fixed); Windows 2003 (not by default, but easy to unfix); Symantec gateway firewalls (SYM04-010 and SYM05-010; go to Yahoo search and find more).
How to find poisoners? Start with a large list of DNS names or zones. Discover the set of authoritative servers for each zone by following referrals down from the root. Query each authoritative nameserver, and compare the NS RR set in each reply to the previously-learned referrals for the parent zones. This technique only finds parent-zone poisoning.
February 2006 survey: the input list is about 6 million names from nameservers they have access to. Found 284 "poisoning" nameservers returning bogus NS entries for the root or a TLD: . has 217; com 49; net 29; org 24; au 3; cc 2; cn 1; to 1; default 1. Some nameservers poison more than one zone. List of some poisoners on slide 12: dns.internic.ca, ns1.afternic.com, ns0.directnic.com, ns1.domainsarefree.com, etc.
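The detection technique in the survey is essentially a comparison of each reply's NS RRset against the referrals learned by walking down from the root. A minimal sketch; the referral data below is invented to mirror the longislandauction.com example, and a real survey would of course use live referral chains:

```python
def find_poison(queried_zone, reply_ns, learned_referrals):
    """Compare NS records in a reply against referrals previously
    learned from the parent zones. An NS record asserted for a zone
    above the one queried (e.g. 'com') that doesn't match the parent's
    real delegation is a parent-zone poisoning attempt. Only catches
    parent-zone poisoning, as in the talk."""
    suspicious = []
    for zone, ns in reply_ns:          # (zone, nameserver) pairs
        if zone == queried_zone:
            continue                   # NS for its own zone: fine
        if ns not in learned_referrals.get(zone, set()):
            suspicious.append((zone, ns))
    return suspicious

# Invented delegation data for illustration:
learned = {"com": {"a.gtld-servers.net", "b.gtld-servers.net"}}
reply = [("longislandauction.com", "auth1.ns.sargasso.net"),
         ("com", "auth1.ns.sargasso.net")]   # bogus out-of-zone NS
bad = find_poison("longislandauction.com", reply, learned)
```

A resolver with proper bailiwick checking discards that second record outright; the vulnerable implementations listed above cached it instead.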
"Never attribute to malice what can be adequately explained by stupidity." Many of the nameservers that return bad referrals appear to be companies in the DNS business: registrars, resellers, speculators, typo profiteers. Others appear to be legitimate companies that should know better. Many of the names leading to poisoners are either expired or parked.
Is the sky falling? With so many poisoners out there, why don't we hear more about the problem? Most implementations don't allow the root to be poisoned. If you were surfing the web with a poisoned DNS cache, would you know it? Let's simulate it... for every bad referral found, we put in the nameserver's IP address, go to www.google.com, go to www.microsoft.com, and see what you get. bbns01.secureserver.net, for example, happily pretends to be google.com. dns.domainsatcost.ca is amusing, because their ads are from Google, even as they hijack it. a.ns.nameflux.com at least does an HTTP redirect. dns2.nai.com doesn't return any A record, so you at least know *something*'s wrong. More examples follow... ns1.frakes.net; ns.pairnic.com--"smart people use pairnic for DNS"... Duane would beg to differ. 65.75.128.178.com returns an amusing message that is clearly wrong, blaming the clients for the traffic the DNS server itself is causing.
Lazy, Stupid, or Evil:
Laziness: ns1.hi2000.com. The admin is too lazy to put each domain delegated to them into separate zone files. Instead, they create a com zone and list A records for each delegation. Laziness such as this is probably the source of most of the poison out there. (Includes a guess at what their zone file looks like.)
Stupidity: ns1.frakes.net. Typos, combined with laziness, create an interesting situation. It looks like frakes.net is using the com-zone technique, but forgot to make the nameserver names fully qualified. Note that ns1.com etc. are legitimate DNS names and have A records different from those returned by ns1.frakes.net. They just forgot the dot after the trailing name on the NS record.
Evilness: our definition of an evil poisoning nameserver is one that answers queries with the wrong address, and maybe proxies web traffic sent there so you get what you (mostly) expect. To help find them, give each source of poison an evilness ranking from 1-5, with one point for each issue below: returning a bad referral; poisoning a TLD; answering an A query for "important names"; answering the query incorrectly; answering the query such that the web browser looks like it *might* be correct DNS. A few fours, no fives.
Miscellany: some of the poison sources that we find are actually vulnerable implementations that have been previously poisoned by someone else. Remember: authoritative nameservers should NEVER accept recursive queries!! Some NS records have non-FQDN names; the name "ns" is a popular example. It's a good thing even the vulnerable implementations don't let the root zone become poisoned.
Bottom line: several hundred misconfigured nameservers out there return bad referrals that can poison DNS caches. About 75% try to p
NANOG36-NOTES talk 1--steve feldman
Based on generally positive feedback from many people, I'll be posting my notes from the conference. I'll preface the subject line with NANOG36-NOTES, so if you want to mass-skip the thread, it should be easy to do so.

2006.02.13 NANOG36 day 1: opening/welcome to Dallas.
Steve Feldman starts off--many people are still trapped out east, unfortunately. Steve Feldman, Program Chair, CNET Networks. Texas Compact Car == SUV.
Our hosts: Brokaw, Brian, Mike, Todd Parker, Raj Patel, Brad Parker. Thanks to the NANOG program committee; the list of them went by too quickly.
Agenda changes--Tuesday 12:10-12:25... went too fast to see. Agenda changes Wednesday 9:30-10: Hurricane Katrina: telecom infrastructure, impacts, solutions, and opportunities... (more on slide).
Reminders: network security--don't use cleartext passwords; do use end-to-end encryption (ssh, VPN). PGP key signing--see the link off nanog.org for details. Yahoo reception; Beer and Gear reception.
Interpreting badges: blue--steering committee; yellow--program committee; green badge--yellow plus blue, both committees; green dot: peering; black dot: security; red dot: PGP signer. RED badges will be for mailing list panel members.
Lightning talks: six 10-minute slots available. Criterion: on-topic for the mailing list. Signups start now! http://www.nanogpc.org/lightning. Random acceptance of submissions made before 2pm Monday; submission order after that (if slots remain).
That's it for Steve Feldman; next up is Brokaw Price from Yahoo. Welcome to Dallas, on behalf of Yahoo. Not many hotels in Dallas can host a group of this size with two ballrooms (one for the general session, one for Beer and Gear); the Hyatt was the only other one with space, and they're booked, since Katrina wiped out other conference spaces in the south. There's a trolley one block over that will take you to the downtown restaurant areas; it's free, so feel free to take it and explore the area, but note it stops running around 9:30pm. It does quiet down after dark in downtown, unfortunately.
One thing really needed for a NANOG is a really good-sized Internet link. The hotel had a pair of T1s to start with, and they were getting a second pair when Yahoo began pulling the fiber in for NANOG. The terminal room is now virtual; feel free to use the laptops and printers should you need to print documents, boarding passes, etc. The laptops are cabled down, but if you need to borrow one, just track Brokaw down, and he'll take care of you. He's been feeling like he's part of the garment industry doing all the gear (two sets of tee shirts, and the fleeces for people who peer with Yahoo--if you don't already peer with Yahoo, jump in and send us mail at [EMAIL PROTECTED]). There will be an awesome party tonight at Eddie Deen's; everyone should make sure to attend--it'll be fun, Texas style, and Texas sized. :) Many thanks to Mike Gallagher for doing the NANOG36-specific website with details on the local area and local options for attendees. Betty Burke asked him to say a few words about what it takes to put on a conference like this; in many ways, it's been like being a huge wedding planner, only weddings don't need large internet connectivity. Start planning early!! Brokaw thanks the Merit people for being so supportive; they've been complete animals, biting into the details with gusto; they're like true roadies, getting gear packaged and shipped, audio gear, video gear, cables, power, everything. Many thanks to Larry, Betty, Chris, Dave, Susan, Laurie, Dwayne, Steve, Tony, Tom, Greg, SC, PC, and everyone else; it's all been completely worth it! It's definitely exciting times--we're building something huge; traffic levels are growing at near-exponential levels, and datacenters are rolling out faster and faster. It's great being part of this community, and we all need to help keep it alive, to nurture it and help it grow. If you haven't hosted a NANOG yet, definitely consider it; it's an interesting process. It starts off with "What's NANOG?" and "What's the ROI on a NANOG?"
Dan Golding joked about having a PBS-style thermometer graph showing how much new peering we get each day, to demonstrate the ROI for hosting a NANOG. But really, it's about sharing the support for the community--it's about stepping up to the plate and saying "it's our turn to pitch in." There's 26 inches of snow in Central Park, which is keeping many, many of our colleagues away; and if you DO host, try to avoid Valentine's Day!! If you have any questions about hosting, feel free to call us. The good people at AT&T have been great working with us. The fiber coming into the hotel terminates in the parking garage, but that was about 60 feet from where it needed to be. December 23rd, they discovered the shortfall, and discovered how to pull innerduct in through a hotel on short notice. No matter what else happens for NANOG: huge thanks to the Yahoo crew, especially Mike Gallagher, Brian Lacroix, Todd Parker, Raj Patel, Brad Parker, the whole AT&T cr
Any interest in notes from the talks at NANOG?
Since there are several attendees that are snowed in and won't be able to make it to Dallas for NANOG, I was thinking of posting my notes from each presentation to the nanog list, so those who are stranded can follow along from home. Would that be of interest to the list, or would it be just so much more useless spam-like fodder to be deleted? So far, the response to my notes from last night's community meeting has been positive--but that's only 2 people, so I don't know if I've helped 2 people, but alienated 8,036 more in the process. ^_^; Let me know if you'd be interested in having me post the notes. Thanks! Matt
2006.02.12 Open Committee Meeting Notes
I captured some notes during tonight's open mike committee meeting, in case they may be of interest to the list. Apologies in advance for typos, it was hard to keep up with the speakers. ^_^; Matt

Steering Committee Report ([EMAIL PROTECTED]) 2006.02.12 1700 hours Central Time.
AGENDA: Steering Committee (Randy Bush); Program Committee (Steve Feldman); Financial Report (Betty Burke); Mailing List Report (Chris Malayter).
Steering committee report: trying to hear the membership; responsible for ML, PC, Logistics, but trying not to micro-manage. Establishing normal but minimal business practices; semi-weekly minutes on the web site. The SC tries to listen. Transparency: SC minutes, ML, stats, ...
Trying a Mon-Wed meeting (instead of Sun-Tues). Newcomers' session (did it work?). No more terminal room (laptops plugged into printers near registration). Change badge fonts (larger company name/person name) (really a question of size, according to WBN). Suggestion from a nice gentleman at the microphone: why not print the badge on both sides, so you don't need to flip the badge around all the time?
Did NOT change: number of meetings per year. Many costs of support are fixed, i.e. not per-meeting; currently amortized over three meetings. If over two meetings, fees would go up significantly.
Did NOT change: working lunch. Hotels have a monopoly on food, and the charges for lunch are what you would expect from a monopoly. But we will try to be more sensitive to ease of getting lunch near or at the meeting venue (not economical to have the hotel provide food).
Rights in data: the NANOG trademark is held by Merit. Presos are copyright by the author; the right to freely distribute, but not modify, is granted to NANOG; the PC is drafting this formally. Copyright notices on slides are OK if small and unobtrusive.
What does it mean to be a member? Attendance at meetings, and participation in the mailing list, is pretty much what defines membership.
Program Committee: first change using the new process seemed successful. Why have you not submitted a talk? What do you want to hear?
Mailing List: worked the process to fill the vacancy left by Steve Gibbard. Still working with the ML panel to document their process. Still working with the ML panel to develop an appeals process. Statistics are published monthly on the NANOG web site.
ML panel appointments: no terms, etc., in the current charter. Straw proposal: charter change parallels SC and PC--two-year terms, staggered; two sequential terms max without a vacation. Please comment, change, propose. (Bill Norton, Equinix: use nanog-futures to discuss this very type of issue... Randy will get to it.)
ML panel process, cont.: this would give members a light at the end of the tunnel; volunteers would know what they're signing up for; allows change without the bad vibe of removal; normal organizational practice.
Charter change: October is the end of the process, so start now. ML panel straw proposal starting; need to get Steve's name and other start-upisms removed. No other proposals received for this year.
New ideas: BLOG--no progress. Wiki--no progress. SlashNOG--no interest. Trial of new tech gear at NANOG--nothing exciting. Video in hallways--do you like it? It's back, same size, better location!! By the cafe tables near the registration area, near where the food will be. Traded the terminal room for informal breakout rooms and informal seating, to allow for more mingling.
Ren Provo--http://nanog.multiply.com/, about 100 pictures with names and affiliations, so you can match up faces with names to help newcomers.
Mailing lists: engineering and ops discussion only ([EMAIL PROTECTED]); discussion about NANOG itself ([EMAIL PROTECTED]); steering ([EMAIL PROTECTED]); program ([EMAIL PROTECTED]); ML ([EMAIL PROTECTED]).
Fruit supplied, yum! Discussion? How can we make NANOG more useful, fun, informative? Randy gripes that the mailing list has gotten boring recently. Cut to Steve Feldman for the Program Committee report. Steve Feldman, CNET, PC Chair.
All opinions and mistakes are his; all the good stuff is thanks to the PC.
NANOG 36 program: 26 submissions (down from 41!!); 22 accepted; 1 cancelled; 1 withdrawn; 2 rejected; 2 very late, both accepted.
Areas for improvement: speaker solicitation; tool improvements (self-service submission web interface, reports).
Program format: Mon-Weds format; morning plenaries; afternoon BOFs and tutorials; evening social events. Newbie meeting--is there a better name? Tracks? Not without more content!! For tutorials and BOFs, hopefully not too much overlap or need to be in two places at once. Party tomorrow courtesy of Yahoo! Tuesday, Beer and Gear with sponsors. For tracks, need sufficient space as well as content.
Lightning talks: criterion: on-topic for the mailing list. Signups start Monday morning; instructions during the plenary. Random acceptance of submissions made before 2pm Monday; submission order after that (if slots remain). (No personal insults! Stay technical, keep it below 10 minutes.)
Feedback: talk to us! PC members have yellow and green badges. Send mail [EMAIL PROTE
Community Meeting Notes
(oops--sent this out last night, but forgot to change the sender to the subscribed-to-nanog address first, gomennasai minnasan) Matt

I took some notes at the NANOG community meeting tonight, and thought I'd share them with the list members in the spirit of transparency--apologies for the typos that may still exist, I'm heading to the social now. :) Matt

2005.10.23 Steering Committee
Randy Bush, IIJ, Tokyo, current chair of the steering committee.
Agenda: steering committee progress/status (Randy); program committee report (Steve); financial report (Betty); mailing list report (Chris). Other than vetted presentations, each speaks as an individual. The microphone is open throughout. How do we progress forward?
Steering committee report: what was asked for? What has been done? What should or will be done? What does the community want?
References: Dan Golding's "Why Reform NANOG", http://www.nanog.org/mtg-0501/pdf/golding.pdf; the [EMAIL PROTECTED] mailing list archives; discussions within the community; the charter is up.
Mailing list problem: continuing problems with NANOG mailing list administration. Solutions: you asked for even moderation, transparency, fairness, clue. So far: we now have a mailing list admin group, drawn from volunteers from the community, tasked with moderating the list. Need: documented process, appeal, ... i.e. transparency. Philip Glass from Cisco, Bill Norton, Billo.
Mailing list committee: Rob Seastrom, Steve Gibbard, Susan Harris, Chris Malayter.
Mailing list issues: SC process re the ML committee not in the charter; terms; selection policy and process; review and approval. ML policy and process: document, get community feedback, and modify. ML appeals process: to the SC.
Program committee: Joe Abley, Bert, Russ, Bill Norton, Pete Templin?
Problem: perceived as being out of touch with the general operator community; community powerless. Solution: empower the community to select content. So far: PC transition from 100% Merit-selected to SC-selected; 8 from the existing PC, 8 from community nominations, with fixed terms. So far: clearly identify PC members (nametags); engage them: orange blobs on their badges! To do: Steve will give the PC report.
PC selection by SC: call for volunteers; 12 re-ups and 18 new. PC/SC call to discuss new volunteers. A sub-SC was formed of SC members who were not also PC members. SC and sub-SC took input from the public and the PC. The sub-SC met and decided the eight to keep; the full SC met to decide which eight new volunteers to add. Too many good candidates!
2006 PC, returning: Bill Woodcock, Chris Morrow, Dave O'Leary, Hank Kilmer, Joe Abley, Kevin Epperson, Steve Feldman, Ted Seely. New: Daniel Golding, Jennifer Rexford, Joel Jaeggli, Josh Snowhorn, Pete Templin.
Charter: no proposals received for this year; formalizing of the ML for next year.
General problem: no transparency in the way NANOG is run. Solution: segmenting the problem and externalizing transparent processes. The steering committee is ultimately accountable to you--we own this one. So far: SC selected the PC. So far: Merit/NANOG financial statements available at NANOG. SC and PC have private but directly accessible mailing lists, [EMAIL PROTECTED] and [EMAIL PROTECTED], specifically for community interaction with these two groups. SC minutes are publicly archived.
Still to do: first time through this process (new PC, SC working with Merit, etc.)--trial and error. Mailing list moderation is still a challenge; need documentation and community discussion, exploration of policy and procedures. BLOG/Wiki/SlashNOG. Trial of new tech gear at NANOG.
Administrative mailing lists: engineering and ops discussion only: [EMAIL PROTECTED]; open meta discussion: [EMAIL PROTECTED]; steering committee: [EMAIL PROTECTED]; program committee: [EMAIL PROTECTED]; ML admins: [EMAIL PROTECTED].
Discussion? Why do people come to NANOG?
What is it that brings us together? Meeting face to face with people we interact with electronically, finding out new trends and new developments, finding out where the internet is going around the world. Finding out how to do her job more effectively. Would like longer breaks--15 minutes is really tough for syncing up with people. USENIX facesavers--putting faces to names would probably be a nice thing to add on. Good to get operational information *off the record*, not just from Bring back the lunches!!! Box lunches would be easy, and would let people meet and greet; requires less time, allows for better interaction. A good opportunity to meet with peers, people we want to peer with, or people we need to smack for bad peering. :D What about a calendar online for while we're at NANOG, so people can schedule time to meet? Susan Harris talks about lunch challenges--box lunches would be better, less of a logistics challenge than getting sit-down space. Steve Wilcox suggests maybe trying to cut down on the evening talks to allow more time to talk to people. Mike Hughes points out it's hard to have a program that fits 400 people as well as allows for smaller gatherings. He's also a WG chair for RIPE, and had troubl