> There sure are some bizarre things in CNs... is there any chance you could
> implement what Marcus Ranum calls "artificial stupidity" in your anomaly-
> detection, create a filter that accepts all standard DN component styles and
> then kick out certs that don't pass the filter?

that's essentially what I did yesterday tho I refined the regex a bit this morning. Here's the (perl) regex..

  /[Cc][Nn]=(?!([\-\w\*]+\.)+[\-\w]+)/

..which appears to also work fine in NEdit. What it does (my intent anyway, I am not an awesome regex master) is recognize any CN-ID pattern that is /not/ syntactically a dot-separated LDH (letter digit hyphen) DNS domain name.

Employed thus..

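while (<>) {
  # print (preceded by a blank line) each input line containing a CN= value
  # that is not shaped like a dot-separated domain name
  if ( /[Cc][Nn]=(?!([\-\w\*]+\.)+[\-\w]+)/ ) {
    print "\n"; print;
  }
}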

I think it does what you suggest below. What I fed into the above loop is a txt file output by querying the raw database table for all contacted domains and returned cert subject values (one pair per line, would of course work on a file with just subjects in it).
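
To check how the regex behaves on individual values before running it over a whole file, a minimal sketch along these lines may help. The regex is the one quoted above, unchanged. The helper name is_anomalous_cn and all of the sample subject strings are invented for illustration and do not come from the survey data.

```perl
use strict;
use warnings;

# The regex from the message, unchanged.
my $not_dns_cn = qr/[Cc][Nn]=(?!([\-\w\*]+\.)+[\-\w]+)/;

# Hypothetical helper: returns 1 when some CN= in the subject is NOT
# followed by something shaped like a dot-separated domain name.
sub is_anomalous_cn {
    my ($subject) = @_;
    return $subject =~ $not_dns_cn ? 1 : 0;
}

# Made-up subject strings, for illustration only.
my @samples = (
    'C=US, O=Example Inc, CN=www.example.com',
    'CN=*.example.com',
    'CN=localhost',
    'O=Example Inc, CN=Example Inc Web Server',
    'CN=192.168.1.1',
    'CN=my_host.example.com',
);

for my $s (@samples) {
    printf "%-8s %s\n", (is_anomalous_cn($s) ? 'FLAGGED' : 'ok'), $s;
}
```

Of these samples, only the "localhost" and "Example Inc Web Server" subjects are flagged. Because \w also matches digits and underscore, a dotted IP address or a name containing an underscore passes as domain-shaped, and because the lookahead is not anchored at the end of the value, trailing text after a domain-shaped prefix passes too. So the filter is somewhat looser than strict LDH.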

>  In other words instead of
> trying to create a regex to detect all the bizarre things that turn up in
> there, create one to pass normal DNs and treat everything that doesn't pass as
> an anomaly?

yes, done.

So there appear to (actually) be 433 such subject DNs. I'd mis-counted yesterday (haste makes waste); the 622 figure is incorrect AFAICT today.

Note that these subject DNs may /also/ contain syntactically correct CN-ID values (many do, some do not).
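
One way to see which CN values within a single subject are domain-shaped and which are not is to pull out each CN= value and test it separately. The sketch below is not the loop used for the counts above. It assumes CN values contain no commas or slashes (which real subjects may violate), and it uses an anchored variant of the domain-shape pattern; the name classify_cns and the sample subject are hypothetical.

```perl
use strict;
use warnings;

# Anchored variant of the domain-shape test (an assumption of this sketch):
# the whole CN value must look like a dot-separated name.
my $dns_shaped = qr/^(?:[\-\w\*]+\.)+[\-\w]+$/;

# Hypothetical helper: split one subject string into its CN values and
# sort them into domain-shaped and other.
# Assumes values contain no commas or slashes.
sub classify_cns {
    my ($subject) = @_;
    my (@dns, @other);
    while ($subject =~ /[Cc][Nn]=([^,\/]*)/g) {
        my $value = $1;
        $value =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        if ($value =~ $dns_shaped) { push @dns, $value }
        else                       { push @other, $value }
    }
    return (\@dns, \@other);
}

# Made-up subject with one free-text CN and one domain-shaped CN.
my ($dns, $other) = classify_cns(
    'O=Example Inc, CN=Example Inc Web Server, CN=www.example.com');
print "domain-shaped: @$dns\n";
print "other:         @$other\n";
```

A subject like the sample would be flagged by the original regex because of its first CN, even though its second CN is a perfectly ordinary domain name.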


> It'd be interesting to see what sort of stuff is floating around
> out there...

of course. Note that the data is nominally available upon request, as Ivan has blogged, which I posted about here earlier..

[certid] fyi: Ivan Ristic / Qualsys SSL Labs release raw data from the Internet SSL survey
http://www.ietf.org/mail-archive/web/certid/current/msg00484.html

There's a shrink-wrap EULA that Ivan wants folks to agree to before obtaining the data and I haven't groveled thru it enough to tell whether it's legit to just post all 433 results (or any other full result set) to a public list & archive such as this.

Also note that he/Qualys may re-run the survey periodically. This is also the intention of the EFF folks and their "TLS/SSL observatory" <https://www.eff.org/observatory> (tho they have yet to publicly release their data). Note also that the two surveys' data collection methodologies differ: the EFF folks say they "NMAPed Internet for hosts listening on tcp 443", whereas Ivan queried publicly registered domain names in a (large) subset of all TLDs.

HTH,

=JeffH


_______________________________________________
certid mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/certid
