Burton,
a few days ago I applied selected parts of your patches. The code seems
to be OK now. If not, please let me know.

Thanks for your work, Luca

"Burton M. Strauss III" wrote:
> 
> (REPOST due to mailing list problems)
> 
> The long awaited (ha ha) phase 2 of my URL Security patch is in the CVS for
> the next snapshot (which will be
> http://snapshot.ntop.org/tgz/ntop-02-03-06.tgz when it's available).
> 
> URLs will now be rejected if they contain any of the following characters:
> %:@\r\n
> 
>      We should probably add the other RFC1945 no-nos to the prohibited
>      characters, that is, make the code at the start of http.c look like
>      this:
> 
>      /*    This list is derived from RFC1945, sec 3.2 Uniform Resource
>            Identifiers, which defines the permitted characters in a
>            URI/URL.  Specifically, the definitions of
> 
>             reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+"
>             unsafe         = CTL | SP | <"> | "#" | "%" | "<" | ">"
> 
>           DO NOT put % here - it's special cased, as it's too dangerous to
>           handle the same...
>       */
>      #define URL_PROHIBITED_CHARACTERS     "\001\002\003\004\005\006\007" \
>                                        "\010\011\012\013\014\015\016\017" \
>                                        "\020\021\022\023\024\025\026\027" \
>                                        "\030\031\032\033\034\035\036\037" \
>                                        " \"#&+/:;<=>?@\177"
> 
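For anyone who wants to try the character test in isolation, here is a minimal sketch of how such a list could be applied. It is not the code in http.c: the function and enum names are made up for illustration, the list writes DEL as octal \177 and covers every control character from \001 to \037, and it assumes the caller has already stripped the leading "/" and any query string from the URL.

```c
#include <string.h>

/* Stand-in for the proposed macro: CTL (\001-\037 and DEL, \177), SP and
   the RFC 1945 reserved/unsafe punctuation. '%' is left out on purpose
   because it is tested separately. */
#define URL_PROHIBITED_CHARACTERS \
    "\001\002\003\004\005\006\007" \
    "\010\011\012\013\014\015\016\017" \
    "\020\021\022\023\024\025\026\027" \
    "\030\031\032\033\034\035\036\037" \
    " \"#&+/:;<=>?@\177"

/* Hypothetical result codes, not the numbers used by ntop. */
enum url_char_check { URL_OK, URL_HAS_PERCENT, URL_HAS_PROHIBITED };

/* Tests '%' first, then every other prohibited character. */
static enum url_char_check check_prohibited(const char *url) {
    if (strchr(url, '%') != NULL)
        return URL_HAS_PERCENT;
    if (strpbrk(url, URL_PROHIBITED_CHARACTERS) != NULL)
        return URL_HAS_PROHIBITED;
    return URL_OK;
}
```

With this, check_prohibited("index.html") yields URL_OK and check_prohibited("a b.html") yields URL_HAS_PROHIBITED.
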
> Changes:
> 
> As at present, URLs which exhibit path relative behavior (. .. or //) will
> also be rejected.
> 
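A rough sketch of the path-relative test, for discussion only. The function name is hypothetical, and the return values 2 and 3 simply mirror the numbers in the sample log lines of this message. The single "." case is left out because dotted names such as 10.0.0.1.html are legitimate and the message does not say how the two are told apart.

```c
#include <string.h>

/* Returns 0 when the URL looks fine, otherwise the number of the
   failed check (2 for "//", 3 for ".."). */
static int check_relative_path(const char *url) {
    if (strstr(url, "//") != NULL)
        return 2;
    if (strstr(url, "..") != NULL)
        return 3;
    return 0;
}
```
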
> Unsupported extensions (other than .htm(l), j, png, gif and css) will be
> rejected.
> 
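The extension test could look roughly like the sketch below. The function name and the table are assumptions, not the http.c code. Only the extensions spelled out unambiguously above are listed (the "j" entry is left out because it is not clear which extension it abbreviates), the comparison is case-sensitive, and names without any extension are accepted.

```c
#include <string.h>

/* Returns 1 if the URL has no extension or one from the allowed table. */
static int extension_allowed(const char *url) {
    static const char *const allowed[] = {
        ".htm", ".html", ".png", ".gif", ".css"
    };
    const char *dot = strrchr(url, '.');   /* last '.' starts the extension */
    size_t i;

    if (dot == NULL)
        return 1;                          /* internal name, no extension */
    for (i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++) {
        if (strcmp(dot, allowed[i]) == 0)
            return 1;
    }
    return 0;
}
```
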
> Finally, there is also logic to test the FORM of the URL, specifically to
> allow ntop's favorite URLs, which are like these:
> 
>     [0..255].[0..255].[0..255].[0..255].html
>     xxxxxx    (no extension - just an internal name)
>     XXXXX-[0..255].[0..255].[0..255].[0..255].html
> 
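To make the three forms concrete, here is a small sketch that recognizes them. All names are hypothetical, and it assumes that "xxxxxx" and the "XXXXX" prefix consist of ASCII letters and digits only, which the message does not state.

```c
#include <ctype.h>
#include <string.h>

/* Parses one decimal octet (0..255) at *p and advances *p past it. */
static int parse_octet(const char **p) {
    int value = 0, digits = 0;
    while (isdigit((unsigned char)**p) && digits < 3) {
        value = value * 10 + (**p - '0');
        (*p)++;
        digits++;
    }
    return digits > 0 && value <= 255;
}

/* Matches [0..255].[0..255].[0..255].[0..255].html */
static int is_dotted_quad_page(const char *s) {
    int i;
    for (i = 0; i < 4; i++) {
        if (!parse_octet(&s))
            return 0;
        if (i < 3) {
            if (*s != '.')
                return 0;
            s++;
        }
    }
    return strcmp(s, ".html") == 0;
}

/* Matches a non-empty internal name made of letters and digits. */
static int is_plain_name(const char *s) {
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isalnum((unsigned char)*s))
            return 0;
    }
    return 1;
}

/* Returns 1 if the URL has one of the three forms listed above. */
static int matches_favorite_form(const char *url) {
    const char *dash, *p;

    if (is_plain_name(url) || is_dotted_quad_page(url))
        return 1;
    dash = strchr(url, '-');
    if (dash == NULL || dash == url)
        return 0;
    for (p = url; p < dash; p++) {         /* prefix before the '-' */
        if (!isalnum((unsigned char)*p))
            return 0;
    }
    return is_dotted_quad_page(dash + 1);
}
```
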
> ALL rejections will log like this:
> 
>      URL security(1): ERROR: Found percent in URL...DANGER...rejecting request
>      URL security(2): ERROR: Found // in URL...rejecting request
>      URL security(3): ERROR: Found .. in URL...rejecting request
> 
> The #s are unique to the rejecting cause - you will find them in the
> URLsecurity function, which is in http.c around line 810.
> 
> I'm more concerned about whether the current implementation causes other
> problems for anyone - rejects good URLs or allows bad ones past.  I'm
> especially interested if anyone is using non-US-ASCII URLs, etc. So please
> fire feedback to me ASAP.
> 
> -----Burton
> 
> PS: Here is the remaining problem.  Here is the BNF:
> 
>        URI            = ( absoluteURI | relativeURI ) [ "#" fragment ]
>        absoluteURI    = scheme ":" *( uchar | reserved )
>        relativeURI    = net_path | abs_path | rel_path
>        net_path       = "//" net_loc [ abs_path ]
>        abs_path       = "/" rel_path
>        rel_path       = [ path ] [ ";" params ] [ "?" query ]
>        path           = fsegment *( "/" segment )
>        fsegment       = 1*pchar
>        segment        = *pchar
>        params         = param *( ";" param )
>        param          = *( pchar | "/" )
>        scheme         = 1*( ALPHA | DIGIT | "+" | "-" | "." )
>        net_loc        = *( pchar | ";" | "?" )
>        query          = *( uchar | reserved )
>        fragment       = *( uchar | reserved )
>        pchar          = uchar | ":" | "@" | "&" | "=" | "+"
>        uchar          = unreserved | escape
>        unreserved     = ALPHA | DIGIT | safe | extra | national
>        escape         = "%" HEX HEX
>        reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+"
>        extra          = "!" | "*" | "'" | "(" | ")" | ","
>        safe           = "$" | "-" | "_" | "."
>        unsafe         = CTL | SP | <"> | "#" | "%" | "<" | ">"
>        national       = <any OCTET excluding ALPHA, DIGIT,
>                         reserved, extra, safe, and unsafe>
> 
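As an aid to reading the BNF, the character classes can be written down as a small classifier. This is only a sketch with made-up names. It takes CTL to mean octets 0 to 31 plus 127, as RFC 1945 defines it, and it shows that every octet from \200 upward falls into "national".

```c
#include <string.h>

/* Hypothetical labels for the BNF character classes. */
enum uri_class {
    URI_ALNUM, URI_RESERVED, URI_UNSAFE, URI_EXTRA, URI_SAFE, URI_NATIONAL
};

/* Classifies one octet according to the BNF above. */
static enum uri_class classify_octet(unsigned char c) {
    if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
        (c >= 'a' && c <= 'z'))
        return URI_ALNUM;                  /* ALPHA | DIGIT */
    if (c <= 31 || c == 127 || c == ' ')
        return URI_UNSAFE;                 /* CTL | SP */
    if (strchr("\"#%<>", c) != NULL)
        return URI_UNSAFE;
    if (strchr(";/?:@&=+", c) != NULL)
        return URI_RESERVED;
    if (strchr("!*'(),", c) != NULL)
        return URI_EXTRA;
    if (strchr("$-_.", c) != NULL)
        return URI_SAFE;
    return URI_NATIONAL;                   /* everything else, incl. >= \200 */
}
```

Note that a few plain ASCII characters, for example "~" and "|", also end up in "national" under these definitions.
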
> While TECHNICALLY there are a lot of characters that worry me, the only
> illegal ones per the RFC are reserved and unsafe.  This is because national
> includes not just ASCII but accents, etc. - anything defined in the character
> set (even Unicode).  We've already de facto decided to bounce Unicode -
> anyone have thoughts here about high values (\200 and above)???  For
> example, our non-US folks - do you know of any websites with names using
> these???
> 
> TIA!
> 
> -----Burton
> 
> _______________________________________________
> Ntop-dev mailing list
> [EMAIL PROTECTED]
> http://listmanager.unipi.it/mailman/listinfo/ntop-dev

-- 
Luca Deri                     NETikos S.p.A.
Via Matteucci 34/B            56124 Pisa, Italy.
Ph. +39/050/968.639           Fax. +39/050/968.626
Personal: [EMAIL PROTECTED]   Business: [EMAIL PROTECTED]
WWW: http://www.lucaderi.org/ ICQ: 68183632
Hacker: someone who loves to program and enjoys being
clever about it - Richard Stallman