[ 
https://issues.apache.org/jira/browse/NUTCH-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964069#comment-13964069
 ] 

Alex McLintock commented on NUTCH-1748:
---------------------------------------

FYI

"The similarity to unix and other disk operating system filename conventions 
should be taken as purely coincidental, and should not be taken to indicate 
that URIs should be interpreted as file names."
 quote from http://www.w3.org/Addressing/URL/4_URI_Recommentations.html

That page also says:

"The slash ("/", ASCII 2F hex) character is reserved for the delimiting of 
substrings whose relationship is hierarchical. This enables partial forms of 
the URI. Substrings consisting of single or double dots ("." or "..") are 
similarly reserved."

So if we take a "substring" to mean a segment delimited by slashes, then 
"/../" is NOT allowed, but ".." with one or more other characters around it 
inside a segment (as in "abc..xyz.txt") should be.
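
As a rough illustration of that reading (this is not the actual 
urlfilter-validator code; the function name has_dot_segment is made up for 
this sketch, and treating a lone "." the same as ".." is my assumption based 
on the quote above), a check in Python could look like:

```python
from urllib.parse import urlsplit


def has_dot_segment(url: str) -> bool:
    """Return True if any '/'-delimited path segment is exactly '.' or '..'.

    Hypothetical helper for illustration only. Dots that appear inside a
    longer segment (e.g. 'abc..xyz.txt') do not count as reserved.
    """
    path = urlsplit(url).path
    # Splitting on '/' yields the delimited substrings; only a segment that
    # consists solely of '.' or '..' is treated as reserved.
    return any(segment in (".", "..") for segment in path.split("/"))


if __name__ == "__main__":
    for candidate in (
        "http://example.com/abc..xyz.txt",
        "http://example.com/a/../b",
        "http://example.com/a/..",
    ):
        verdict = "reject" if has_dot_segment(candidate) else "accept"
        print(candidate, "->", verdict)
```

Under this interpretation "abc..xyz.txt" is accepted, while "/../" and a 
trailing "/.." are still rejected, which matches what the issue asks for.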


> Although Unix systems accept file names containing two dots, urlfilter-validator 
> rejects such path names.
> --------------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-1748
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1748
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.2.1
>            Reporter: Sertac TURKEL
>            Priority: Minor
>             Fix For: 2.3
>
>
> Unix systems accept file names containing two dots, such as "abc..xyz.txt", so
> urlfilter-validator should not reject URLs of this kind. Paths containing
> "/../", or "/.." in final position, should still be rejected.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
