Re: NetUtils / StringUtils / Tokenizer
Nathaniel Alfred wrote: The 30 lines are already in an available JAR, only normalize() is currently private there: http://cvs.apache.org/viewcvs.cgi/avalon-excalibur/sourceresolve/src/jav a/org/apache/excalibur/source/SourceUtil.java?rev=1.5&view=auto Hmmm, interesting ... nowadays you can find an implementation under every turned stone, it seems ;-). We might compare the different implementations, see which is better and keep only that. Just to be picky, the one in Excalibur creates maybe too many temp String objects: path = path.substring(0, i + 1) + path.substring(i + 3); and scans the input too many times for my taste. Anyway, all of this will become moot the day we decide not to support JDK 1.3 anymore. Ugo
Re: NetUtils / StringUtils / Tokenizer
Antonio Gallardo wrote: There are other places where we can take advantage of this jar. If so, please go on and change them to make use of the library, but remember that current users of our code may rely on some undocumented behaviour (like, for instance, never returning null from NetUtils.normalize). It would be fine also, in order to verify that the assumptions still hold, to keep the current testcases, even if we would be effectively testing others' code. In the end, I'm not sure it's worth it, particularly after having seen of Commons' "peculiarities". Ugo
RE: NetUtils / StringUtils / Tokenizer
>Re: code reuse. I'm quite eager too reuse code, usually, but I'm -1 on >adding another JAR (and another dependency) only for 30 lines of code. > > Ugo The 30 lines are already in an available JAR, only normalize() is currently private there: http://cvs.apache.org/viewcvs.cgi/avalon-excalibur/sourceresolve/src/jav a/org/apache/excalibur/source/SourceUtil.java?rev=1.5&view=auto HTH, Alfred. This message is for the named person's use only. It may contain confidential, proprietary or legally privileged information. No confidentiality or privilege is waived or lost by any mistransmission. If you receive this message in error, please notify the sender urgently and then immediately delete the message and any copies of it from your system. Please also immediately destroy any hardcopies of the message. You must not, directly or indirectly, use, disclose, distribute, print, or copy any part of this message if you are not the intended recipient. The sender's company reserves the right to monitor all e-mail communications through their networks. Any views expressed in this message are those of the individual sender, except where the message states otherwise and the sender is authorised to state them to be the views of the sender's company.
Re: NetUtils / StringUtils / Tokenizer
Ugo Cei dijo: > Re: code reuse. I'm quite eager too reuse code, usually, but I'm -1 on > adding another JAR (and another dependency) only for 30 lines of code. There are other places where we can take advantage of this jar. Best Regards, Antonio Gallardo
Re: NetUtils / StringUtils / Tokenizer
Il giorno 30/apr/04, alle 02:11, Joerg Heinicke ha scritto: Without claiming to read Ugo's mind I think that's exactly the point. Too many ".."s means that there are more directories to walk up then it is possible. /../ is only the most simple sample. /foo/../../ is another one. And FilenameUtils.normalize()'s javadoc says clearly: "Returns null if the ..'s went past the root." And that's bizarre. Exactly. Why put the burden of checking the return value of the function on the caller? It's too damn easy to forget and get an NPE. Re: code reuse. I'm quite eager too reuse code, usually, but I'm -1 on adding another JAR (and another dependency) only for 30 lines of code. Ugo
Re: NetUtils / StringUtils / Tokenizer
On 30.04.2004 02:04, Antonio Gallardo wrote: This method returns null if there are too many ".."s, which strikes me as rather bizarre. Can you explain more about that? /../ --> null That's what Ugo called 'bizarre'. That is not the point. Even me, can clearly understand that. 8-| I did not query it :-P The above sentence talk about ''too many ".."s.'' That is what I asked for. Without claiming to read Ugo's mind I think that's exactly the point. Too many ".."s means that there are more directories to walk up then it is possible. /../ is only the most simple sample. /foo/../../ is another one. And FilenameUtils.normalize()'s javadoc says clearly: "Returns null if the ..'s went past the root." And that's bizarre. Joerg
Re: NetUtils / StringUtils / Tokenizer
Joerg Heinicke dijo: > On 30.04.2004 01:19, Antonio Gallardo wrote: > Yep. I found it :-D http://jakarta.apache.org/commons/io/apidocs/org/apache/commons/io/ FilenameUtils.html#normalize(java.lang.String) >>> >>>This method returns null if there are too many ".."s, which strikes me >>>as rather bizarre. >> >> >> Can you explain more about that? > > /../ --> null > > That's what Ugo called 'bizarre'. That is not the point. Even me, can clearly understand that. 8-| The above sentence talk about ''too many ".."s.'' That is what I asked for. From the javadoc of commons IO the method can manage the simple /../ that is all. If not. would be fine to post on the common list about the problem. I am +1 on sharing as much code as we can to avoid the reinvention of the wheel. Another point is if the function meets the Cocoon's requirements. Best Regards, Antonio Gallardo. > > Joerg >
Re: NetUtils / StringUtils / Tokenizer
On 30.04.2004 01:19, Antonio Gallardo wrote: Yep. I found it :-D http://jakarta.apache.org/commons/io/apidocs/org/apache/commons/io/ FilenameUtils.html#normalize(java.lang.String) This method returns null if there are too many ".."s, which strikes me as rather bizarre. Can you explain more about that? /../ --> null That's what Ugo called 'bizarre'. Joerg
Re: NetUtils / StringUtils / Tokenizer
Ugo Cei dijo: > Il giorno 29/apr/04, alle 23:53, Antonio Gallardo ha scritto: > >> Yep. I found it :-D >> http://jakarta.apache.org/commons/io/apidocs/org/apache/commons/io/ >> FilenameUtils.html#normalize(java.lang.String) >> > > This method returns null if there are too many ".."s, which strikes me > as rather bizarre. Can you explain more about that? Best Regards, Antonio Gallardo
Re: NetUtils / StringUtils / Tokenizer
On 29.04.2004 23:36, Joerg Heinicke wrote: "/../" results in "/../" instead of "/" and I don't know ... http://www.ietf.org/rfc/rfc2396.txt 5.2. Resolving Relative References to Absolute Form 6) a) All but the last segment of the base URI's path component is copied to the buffer. In other words, any characters after the last (right-most) slash character, if any, are excluded. b) The reference's path component is appended to the buffer string. c) All occurrences of "./", where "." is a complete path segment, are removed from the buffer string. d) If the buffer string ends with "." as a complete path segment, that "." is removed. e) All occurrences of "/../", where is a complete path segment not equal to "..", are removed from the buffer string. Removal of these path segments is performed iteratively, removing the leftmost matching pattern on each iteration, until no matching pattern remains. f) If the buffer string ends with "/..", where is a complete path segment not equal to "..", that "/.." is removed. g) If the resulting buffer string still begins with one or more complete path segments of "..", then the reference is considered to be in error. Implementations may handle this error by retaining these components in the resolved path (i.e., treating them as part of the final URI), by removing them from the resolved path (i.e., discarding relative levels above the root), or by avoiding traversal of the reference. h) The remaining buffer string is the reference URI's new path component. g) means it's an error, but may be handled as /../ or / - jakarta.commons.io.FilenameUtils' null in the sense of notifying an error is also correct. Hmm ... Joerg
Re: NetUtils / StringUtils / Tokenizer
Il giorno 29/apr/04, alle 23:53, Antonio Gallardo ha scritto: Yep. I found it :-D http://jakarta.apache.org/commons/io/apidocs/org/apache/commons/io/ FilenameUtils.html#normalize(java.lang.String) This meethod returns null if there are too many ".."s, which strikes me as rather bizarre. Ugo
Re: NetUtils / StringUtils / Tokenizer
Il giorno 29/apr/04, alle 23:36, Joerg Heinicke ha scritto: The mentioned class java.net.URI#normalize() in JDK 1.4 behaves different for two cases: "foo/bar1/bar2/bar3/../../.." results in "foo/" instead of "foo" and that's absolutely correct. "/../" results in "/../" instead of "/" and I don't know ... OK, can you please add some more test cases? I'll try to make the code behave just the same. By the way, what does "/../../" normalize to? Ugo
Re: NetUtils / StringUtils / Tokenizer
Antonio Gallardo dijo: > Joerg Heinicke dijo: >> On 29.04.2004 22:12, Ugo Cei wrote: >> >>> I've just committed a new version that: >>> >>> - passes all tests >>> - uses only JDK core classes >>> - has less lines of code than the previous one >>> >>> Hope you all like it. >> >> Very good, but ... first the good news: >> >> The build fails now if a test fails. >> And there is a new target for debugging tests what I find really >> helpful. >> >> Now the bad news: >> >> The mentioned class java.net.URI#normalize() in JDK 1.4 behaves >> different for two cases: >> >> "foo/bar1/bar2/bar3/../../.." results in "foo/" instead of "foo" and >> that's absolutely correct. >> >> "/../" results in "/../" instead of "/" and I don't know ... > > Hi: > > Wonder if there is not already a function for these task in an Apache > commons jar. Would be fine to look for it. Yep. I found it :-D http://jakarta.apache.org/commons/io/apidocs/org/apache/commons/io/FilenameUtils.html#normalize(java.lang.String) Best Regards, Antonio Gallardo
Re: NetUtils / StringUtils / Tokenizer
Joerg Heinicke dijo: > On 29.04.2004 22:12, Ugo Cei wrote: > >> I've just committed a new version that: >> >> - passes all tests >> - uses only JDK core classes >> - has less lines of code than the previous one >> >> Hope you all like it. > > Very good, but ... first the good news: > > The build fails now if a test fails. > And there is a new target for debugging tests what I find really helpful. > > Now the bad news: > > The mentioned class java.net.URI#normalize() in JDK 1.4 behaves > different for two cases: > > "foo/bar1/bar2/bar3/../../.." results in "foo/" instead of "foo" and > that's absolutely correct. > > "/../" results in "/../" instead of "/" and I don't know ... Hi: Wonder if there is not already a function for these task in an Apache commons jar. Would be fine to look for it. Best Regards, Antonio Gallardo > > Joerg >
Re: NetUtils / StringUtils / Tokenizer
On 29.04.2004 22:12, Ugo Cei wrote: I've just committed a new version that: - passes all tests - uses only JDK core classes - has less lines of code than the previous one Hope you all like it. Very good, but ... first the good news: The build fails now if a test fails. And there is a new target for debugging tests what I find really helpful. Now the bad news: The mentioned class java.net.URI#normalize() in JDK 1.4 behaves different for two cases: "foo/bar1/bar2/bar3/../../.." results in "foo/" instead of "foo" and that's absolutely correct. "/../" results in "/../" instead of "/" and I don't know ... Joerg
Re: NetUtils / StringUtils / Tokenizer
I've just committed a new version that: - passes all tests - uses only JDK core classes - has less lines of code than the previous one Hope you all like it. Ugo
Re: NetUtils / StringUtils / Tokenizer
Ugo Cei wrote: Joerg Heinicke wrote: 3. while the Tokenizer correctly works on "/../" the NetUtils.normalize() method has a problem with it: What is the correct output for "/../"? I'd say "/", but the testcase expects "/../". I would consider /../ an error. What does new File("/../foo") do? Upayavira
Re: NetUtils / StringUtils / Tokenizer
Ugo Cei cbim.it> writes: > > 3. while the Tokenizer correctly works on "/../" the > > NetUtils.normalize() method has a problem with it: > > What is the correct output for "/../"? I'd say "/", but the testcase > expects "/../". Good question. I did not thought much about it when adding it as testcase, but I think you are right with "/". I guess there are a few RFCs telling the correct way. Googling I found an URI class in JDK 1.4: http://java.sun.com/j2se/1.4.2/docs/api/java/net/URI.html. It has almost all our NetUtils methods and claims RFC compatibility. So the best is probably to implement the same behaviour, if not even copying the algorithm. Is this allowed to provide this functionality to older JDKs? Joerg
Re: NetUtils / StringUtils / Tokenizer
Joerg Heinicke wrote: 3. while the Tokenizer correctly works on "/../" the NetUtils.normalize() method has a problem with it: What is the correct output for "/../"? I'd say "/", but the testcase expects "/../". Ugo
Re: NetUtils / StringUtils / Tokenizer
Bruno Dumon wrote: My advice: in the 2.1 series, leave that code as is, especially as we're close to a release. It's too easy to brake it. The current code does not pass all the tests. Since we have a testcase, I suggest (and volunteer for) doing the simplest thing that could possibly work and pass all the tests. It should be a handful of code lines, so there's no point using regexps or something from Excalibur (the less we depend on it, the better). JFDI ;-). Ugo
Re: NetUtils / StringUtils / Tokenizer
On Thu, 2004-04-29 at 14:44, Stefano Mazzocchi wrote: > Joerg Heinicke wrote: > > After Cheche's bug report I had a look at NetUtils. Though Ugo fixed it > > with a quick fix, it does not really solve the problem as a test with > > the updated NetUtilsTestCase and the NetUtils before my latest commit > > can easily show. > > > > My problem is that there are many problems in different places in the code: > > > > 1. o.a.commons.lang.StringUtils.split() behaves strange in that way that > > o.a.com.l.SU.split("", "/") return an empty String[] > > while > > o.a.coc.u.SU.split("", "/") returns String[] {""} which is what I would > > expect. > > Also > > o.a.com.l.SU.split("/", "/") returns empty String[]. > > This means the iteration over the string array can never really work > > with o.a.commons.lang.StringUtils. > > > > => reverted that change. > > > > > > 2. our Tokenizer behaves strange in one case: > > > > o.a.coc.u.SU.split("/", "/") > > => new Tokenizer("/", "/", false) > > => tokenizer.countTokens() = 2 > > => but tokenizer.hasMoreTokens() returns true only once > > => leads to String[] {"", null} while it should be String[] {"", ""} > > > > > > 3. while the Tokenizer correctly works on "/../" the > > NetUtils.normalize() method has a problem with it: > > > > result of split/tokenize: String[] {"", "..", ""} > > result of normalize(): "" > > What about using ORO regexp for these things? When I first wrote I was > not really much into regular expressions, but now I do think it makes > sense to use them, these guys spent a lot of time making sure those > things don't happen. There are lots of similar utility functions in the class SourceUtil of the excalibur sourceresolver. There are probably all kinds of subtle differences though. I remember changing the code doing the URL parsing there using a regexp, but this gave stackoverflowexceptions on long URL's (I think it was using jakarta-regexp), and many people considered using a regexp for such critical code not a good idea. My advice: in the 2.1 series, leave that code as is, especially as we're close to a release. It's too easy to brake it. -- Bruno Dumon http://outerthought.org/ Outerthought - Open Source, Java & XML Competence Support Center [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: NetUtils / StringUtils / Tokenizer
Joerg Heinicke wrote: After Cheche's bug report I had a look at NetUtils. Though Ugo fixed it with a quick fix, it does not really solve the problem as a test with the updated NetUtilsTestCase and the NetUtils before my latest commit can easily show. My problem is that there are many problems in different places in the code: 1. o.a.commons.lang.StringUtils.split() behaves strange in that way that o.a.com.l.SU.split("", "/") return an empty String[] while o.a.coc.u.SU.split("", "/") returns String[] {""} which is what I would expect. Also o.a.com.l.SU.split("/", "/") returns empty String[]. This means the iteration over the string array can never really work with o.a.commons.lang.StringUtils. => reverted that change. 2. our Tokenizer behaves strange in one case: o.a.coc.u.SU.split("/", "/") => new Tokenizer("/", "/", false) => tokenizer.countTokens() = 2 => but tokenizer.hasMoreTokens() returns true only once => leads to String[] {"", null} while it should be String[] {"", ""} 3. while the Tokenizer correctly works on "/../" the NetUtils.normalize() method has a problem with it: result of split/tokenize: String[] {"", "..", ""} result of normalize(): "" What about using ORO regexp for these things? When I first wrote I was not really much into regular expressions, but now I do think it makes sense to use them, these guys spent a lot of time making sure those things don't happen. -- Stefano. smime.p7s Description: S/MIME Cryptographic Signature
Re: NetUtils / StringUtils / Tokenizer
Le 29 avr. 04, à 11:52, Upayavira a écrit : ...Doesn't it replace, for example /cocoon/src/java/../webapp with /cocoon/src/webapp? i.e. removing .. from paths? FWIW, I have attached the results of find src -type f -name *.java | xargs -n200 grep 'NetUtils.*normalize' -Bertrand results of find src -type f -name *.java | xargs -n200 grep 'NetUtils.*normalize' src/blocks/portal/java/org/apache/cocoon/environment/portlet/ActionResponse.java: absLoc = NetUtils.normalize(absLoc); src/blocks/scratchpad/java/org/apache/cocoon/acting/LinkTranslatorMapAction.java: path = NetUtils.normalize(path); src/blocks/scratchpad/java/org/apache/cocoon/ant/CocoonCrawling.java: String normalizedUriType = NetUtils.normalize(uriType.getUri()); src/blocks/scratchpad/java/org/apache/cocoon/ant/UriType.java:final String normalizedUri = NetUtils.normalize(uri); src/blocks/scratchpad/java/org/apache/cocoon/ant/UriType.java:this.uri = NetUtils.normalize(absolutizedUri); src/blocks/scratchpad/java/org/apache/cocoon/ant/UriType.java:final String normalizedUri = NetUtils.normalize(uri); src/java/org/apache/cocoon/bean/Target.java:sourceURI = NetUtils.normalize(root + sourceURI); src/java/org/apache/cocoon/bean/Target.java: NetUtils.normalize(NetUtils.absolutize(this.getPath(), linkURI)); src/test/org/apache/cocoon/util/test/NetUtilsTestCase.java: * A unit test for NetUtils.normalize() src/test/org/apache/cocoon/util/test/NetUtilsTestCase.java:String result = NetUtils.normalize(test);
Re: NetUtils / StringUtils / Tokenizer
Il giorno 29/apr/04, alle 11:52, Upayavira ha scritto: Frankly, I don't even know what NetUtils.normalize is supposed to do, so I cannot be of any help here. Doesn't it replace, for example /cocoon/src/java/../webapp with /cocoon/src/webapp? i.e. removing .. from paths? Ah, I see. It would have been nice if the Javadoc actually stated that, wouldn't it? OK, I won't promise anything, but I'll try to refactor it into something clearer and satisfying the testcases. Ugo
Re: NetUtils / StringUtils / Tokenizer
Ugo Cei wrote: Il giorno 29/apr/04, alle 02:48, Joerg Heinicke ha scritto: After Cheche's bug report I had a look at NetUtils. Though Ugo fixed it with a quick fix, it does not really solve the problem as a test with the updated NetUtilsTestCase and the NetUtils before my latest commit can easily show. Frankly, I don't even know what NetUtils.normalize is supposed to do, so I cannot be of any help here. Doesn't it replace, for example /cocoon/src/java/../webapp with /cocoon/src/webapp? i.e. removing .. from paths? Regards, Upayavira
Re: NetUtils / StringUtils / Tokenizer
Il giorno 29/apr/04, alle 02:48, Joerg Heinicke ha scritto: After Cheche's bug report I had a look at NetUtils. Though Ugo fixed it with a quick fix, it does not really solve the problem as a test with the updated NetUtilsTestCase and the NetUtils before my latest commit can easily show. Frankly, I don't even know what NetUtils.normalize is supposed to do, so I cannot be of any help here. Ugo