[ https://issues.apache.org/jira/browse/VALIDATOR-467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Larionov updated VALIDATOR-467:
------------------------------------
    Description:
{code:java}
import org.apache.commons.validator.routines.UrlValidator;
...
private static final String[] schemes = {"http", "https"};
private static final UrlValidator urlValidator = new UrlValidator(schemes, UrlValidator.ALLOW_LOCAL_URLS + UrlValidator.ALLOW_2_SLASHES);
...
urlValidator.isValid("https://example.com//some_path/path/")
{code}
This returns {{false}}. However, such a URL is valid if the authority is not {{null}}.

The reason it returns {{false}} is this code in the validator:

https://github.com/apache/commons-validator/blob/a3771313c9f1833abf32c7c294ad1de4810e532d/src/main/java/org/apache/commons/validator/routines/UrlValidator.java#L452-L461
{code:java}
try {
    URI uri = new URI(null,null,path,null);
    String norm = uri.normalize().getPath();
    if (norm.startsWith("/../") // Trying to go via the parent dir
     || norm.equals("/..")) {   // Trying to go to the parent dir
        return false;
    }
} catch (URISyntaxException e) {
    return false;
}
{code}
As far as I understand, {{URI uri = new URI(null,null,path,null);}} throws {{URISyntaxException}} if the authority is {{null}} and the path starts with {{//}}.

I tried running {{new URI(null, "example.com", path, null);}} and it worked.

I didn't read the RFC, but from some googling around I got the following:

{{//some_path}} is invalid if authority is null
{{//some_path}} is valid if authority is not null

Update:

Another thing I noticed while testing is that the following actually passes the validation: {{"https://example.com//test//double//slashes"}}

And that {{"https://example.com//test"}} fails the validation, but {{URISyntaxException}} is thrown due to {{Illegal character in hostname}} and not due to {{//}} at the start.

So my original theory behind the failure looks incorrect now; however, I still consider this a valid bug.

> URL validator fails if path starts with double slash
> ----------------------------------------------------
>
>                 Key: VALIDATOR-467
>                 URL: https://issues.apache.org/jira/browse/VALIDATOR-467
>             Project: Commons Validator
>          Issue Type: Bug
>          Components: Routines
>    Affects Versions: 1.6
>            Reporter: Ivan Larionov
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
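The {{java.net.URI}} behaviour described in the report can be checked in isolation, without Commons Validator on the classpath. The sketch below is only an illustration: the class {{UriPathProbe}} and its two helper methods are hypothetical names, not part of the validator. It calls the same four-argument constructor that the quoted snippet uses, once with a {{null}} host and once with {{example.com}}. One possible reading, consistent with the {{Illegal character in hostname}} message mentioned in the update, is that with a {{null}} host the leading {{//}} causes {{some_path}} to be parsed as an authority, but that is an assumption about the parser and not something the report confirms.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class for illustration only; not part of Commons Validator.
final class UriPathProbe {

    // Mirrors the validator's call: scheme, host and fragment are all null.
    static boolean parsesWithoutHost(String path) {
        try {
            new URI(null, null, path, null);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Same constructor, but with an explicit host, as tried in the report.
    static boolean parsesWithHost(String host, String path) {
        try {
            new URI(null, host, path, null);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] paths = {
            "/some_path/path/",
            "//some_path/path/",
            "//test//double//slashes"
        };
        for (String path : paths) {
            System.out.println(path
                    + " | null host: " + parsesWithoutHost(path)
                    + " | example.com host: " + parsesWithHost("example.com", path));
        }
    }
}
```

Running {{main}} prints, for each path, whether the constructor accepted it with and without a host, which makes it easy to compare the {{//some_path}} and {{//test//double//slashes}} cases side by side.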