[ https://issues.apache.org/jira/browse/NUTCH-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964069#comment-13964069 ]
Alex McLintock commented on NUTCH-1748:
---------------------------------------

FYI

"The similarity to unix and other disk operating system filename conventions should be taken as purely coincidental, and should not be taken to indicate that URIs should be interpreted as file names."

(quoted from http://www.w3.org/Addressing/URL/4_URI_Recommentations.html)

That page also says:

"The slash ("/", ASCII 2F hex) character is reserved for the delimiting of substrings whose relationship is hierarchical. This enables partial forms of the URI. Substrings consisting of single or double dots ("." or "..") are similarly reserved."

So if we assume that a "substring" is whatever sits between delimiters, then "/../" is NOT allowed, but ".." surrounded by one or more other characters (as in "abc..xyz.txt") should be allowed.

> Despite Unix systems accept files containing two dots.Urlfilter-validator
> rejects such path names.
> --------------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-1748
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1748
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.2.1
>            Reporter: Sertac TURKEL
>            Priority: Minor
>             Fix For: 2.3
>
>
> Unix systems accept file names containing two dots, such as "abc..xyz.txt", so
> urlfilter-validator should not reject this kind of URL. Paths containing
> "/../", or ending in "/..", should still be rejected.
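
A minimal sketch of the segment rule discussed above: a path is rejected only when a whole "/"-delimited segment is exactly "." or "..", while ".." embedded in a longer segment is accepted. This is not the urlfilter-validator code; the class name DotSegmentCheck and the method hasDotSegment are hypothetical, and rejecting the single-dot segment is an assumption taken from the quoted recommendation (the issue itself only mentions "/../" and a trailing "/..").

```java
// Hypothetical illustration only; not part of Nutch.
final class DotSegmentCheck {

    /** Returns true if any "/"-delimited segment of the path is exactly "." or "..". */
    static boolean hasDotSegment(String path) {
        // A limit of -1 keeps trailing empty strings, so "/a/.." splits into ["", "a", ".."].
        for (String segment : path.split("/", -1)) {
            if (segment.equals(".") || segment.equals("..")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {"/abc..xyz.txt", "/a/../b", "/a/..", "/a/b..", "/..a/b"};
        for (String s : samples) {
            System.out.println(s + " -> " + (hasDotSegment(s) ? "reject" : "accept"));
        }
    }
}
```

Comparing whole segments, rather than searching the path for the substring "..", is what lets a name like "abc..xyz.txt" through while "/../" and a final "/.." are still caught.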