On Sun, 2009-04-05 at 23:26 +0200, Thorsten Scherler wrote:
> On Sun, 2009-04-05 at 22:01 +0200, Thorsten Scherler wrote:
> > On Sun, 2009-04-05 at 00:44 +0100, Robin Howlett wrote:
> > >...
> > > The path will always be converted to /bar.html and is checked against the
> > > Rules in rules and wildcardRules but won't be found. However, basepath
> > > (which will now be /foo) is never checked against the Rules, therefore
> > > giving an incorrect true result for the isUrlAllowed method, no?
> >
> > Hmm, see above, I disagree but have not debugged yet. Will do that now.
> >
>
> I just tried and you are right. The norobot code originally comes from
> the hc project. Will have a look now whether the bug was originally in
> there or not.

http://svn.apache.org/viewvc/incubator/droids/branch/preIncubator/src/core/java/org/apache/http/norobots/NoRobotClient.java?revision=366650&view=markup

I just love svn. ;)

So it seems it has always been like this. Maybe we are calling it in a
way we should not. Let me explain:

https://svn.apache.org/repos/asf/incubator/droids/trunk/droids-norobots/src/test/java/org/apache/droids/norobots/TestNorobotsClient.java

I said earlier in the thread:
"The base path in our example is http://www.example.com."

I said this because of https://issues.apache.org/jira/browse/DROIDS-4:
"...
http://www.robotstxt.org/norobots-rfc.txt (sec 3.1)
"...under a standard relative path on the server: "/robots.txt"."
> It should be "new URL(base, "/robots.txt");"
"

Meaning the base should be the root of the server and not
http://www.example.com/foo.

Can you open an issue so we do not lose track?

TIA

salu2
--
Thorsten Scherler <thorsten.at.apache.org>
Open Source <consulting, training and solutions>
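
To illustrate the point about the base, here is a minimal sketch using plain java.net.URL. The class RobotsTxtLocation and the helper robotsTxtFor are hypothetical names made up for this example; they are not part of Droids or the norobots code. It only shows that resolving "/robots.txt" against any page URL lands on the server root, as the DROIDS-4 quote suggests. Note that the URL constructors are deprecated in recent Java releases, though they still work.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class RobotsTxtLocation {

    // Hypothetical helper (not from Droids): returns the robots.txt URL
    // for the server that hosts the given page.
    static URL robotsTxtFor(String pageUrl) {
        try {
            URL base = new URL(pageUrl);
            // A spec with a leading "/" replaces the whole path of the
            // context URL, so "/foo" or "/foo/bar.html" is discarded.
            return new URL(base, "/robots.txt");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + pageUrl, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(robotsTxtFor("http://www.example.com/foo"));
        System.out.println(robotsTxtFor("http://www.example.com/foo/bar.html"));
    }
}
```

Both calls in main resolve to the same robots.txt at the server root, which is why passing http://www.example.com/foo as the base does not change where the rules are fetched from. Whether the rule matching then treats /foo as part of the path to check is a separate question, and is the one raised earlier in the thread.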
