Re: Uri class and parser
Any word on the review process for this? For now I have been using it in my projects by just putting it in std/net on its own. But I'd obviously like to see it in the standard library some down, as its URI parsing is very valuable. :)
Re: Uri class and parser
Been thinking about this for a while now, but I can't decide which one I should choose. Currently there is a class URI which has an static method (parser) which returns an instance of URI on success. On failure it will return null. I agree with Jens Mueller on the fact that URI should be a struct instead of a class. But then I won't be able to return null anymore so I should throw an exception when an invalid URI has been passed to the constructor. I'm not sure if this is how problems are being handled in phobos. Something entirely else is the CTFE compatibility of URI. At first I though that because a new instance of URI can be created as a const, it would be evaluated on compile time. This is part of how I test CTFE at the moment: const URI uri36 = URI.parse(http://dlang.org/;); assert(uri36.scheme == http); I tried changing 'const' to 'static' but that resulted in an error. (_adSort cannot be interpreted at compile time, because it has no available source code) Now I'm not sure anymore how to test if my code meets the CTFE requirements. I hope someone can shed some light on my problems and help me make a decision.
Re: Uri class and parser
Something entirely else is the CTFE compatibility of URI. At first I though that because a new instance of URI can be created as a const, it would be evaluated on compile time. This is part of how I test CTFE at the moment: const URI uri36 = URI.parse(http://dlang.org/;); assert(uri36.scheme == http); I tried changing 'const' to 'static' but that resulted in an error. (_adSort cannot be interpreted at compile time, because it has no available source code) Now I'm not sure anymore how to test if my code meets the CTFE requirements. To force something to be evaluated at compile time, you can assign it to an enum, like this: enum uri = URI.parse(http://dlang.org/;);
Re: Uri class and parser
On Thursday, 8 November 2012 at 15:32:59 UTC, jerro wrote: Something entirely else is the CTFE compatibility of URI. At first I though that because a new instance of URI can be created as a const, it would be evaluated on compile time. This is part of how I test CTFE at the moment: const URI uri36 = URI.parse(http://dlang.org/;); assert(uri36.scheme == http); I tried changing 'const' to 'static' but that resulted in an error. (_adSort cannot be interpreted at compile time, because it has no available source code) Now I'm not sure anymore how to test if my code meets the CTFE requirements. To force something to be evaluated at compile time, you can assign it to an enum, like this: enum uri = URI.parse(http://dlang.org/;); Thnx. Got myself some new errors ;) It seems that std.string.indexOf() does not work at compile time. Is there a solution or alternative method for this?
Re: Uri class and parser
Thnx. Got myself some new errors ;) It seems that std.string.indexOf() does not work at compile time. Is there a solution or alternative method for this? I guess the proper solution would be to make std.string.indexOf work at compile time. It looks like changing the first if (std.ascii.isASCII(c)) line in std.string.indexOf to if (!__ctfe std.ascii.isASCII(c)) Makes it work at compile time.
Re: Uri class and parser
On Thursday, 8 November 2012 at 17:02:25 UTC, jerro wrote: Thnx. Got myself some new errors ;) It seems that std.string.indexOf() does not work at compile time. Is there a solution or alternative method for this? I guess the proper solution would be to make std.string.indexOf work at compile time. It looks like changing the first if (std.ascii.isASCII(c)) line in std.string.indexOf to if (!__ctfe std.ascii.isASCII(c)) Makes it work at compile time. After trying your solution I found out I was calling indexOf(string, char) which apparently is different than indexOf(string, string) as I now no longer have that error. Instead, when I call parse on compile time I get the following at the method parse: Error: URI class literals cannot be returned from CTFE The method returns an instance of class URI and works perfectly when called at runtime. As far as I can see it has nothing to do with my previous problem. I do thank you for your answer.
Re: Uri class and parser
After trying your solution I found out I was calling indexOf(string, char) which apparently is different than indexOf(string, string) as I now no longer have that error. Instead, when I call parse on compile time I get the following at the method parse: Error: URI class literals cannot be returned from CTFE It looks like you can't have class enums. This fails too: class A{} enum a = new A; I don't think you can get around this, but you can still test if your URI class works at compile time by doing something like this: auto foo() { // Write code that uses the URI class here // return something that can be assigned to an enum return something; } enum bar = foo();
Re: Uri class and parser
On Friday, November 09, 2012 00:42:42 jerro wrote: After trying your solution I found out I was calling indexOf(string, char) which apparently is different than indexOf(string, string) as I now no longer have that error. Instead, when I call parse on compile time I get the following at the method parse: Error: URI class literals cannot be returned from CTFE It looks like you can't have class enums. This fails too: class A{} enum a = new A; I don't think you can get around this Nope. To some extent, classes can be used at compile time, but they can't persist from compile time to runtime. So, if you really want a type which is useable with CTFE, make it a struct, not a class. - Jonathan M Davis
Re: Uri class and parser
On Thursday, 8 November 2012 at 23:51:47 UTC, Jonathan M Davis wrote: On Friday, November 09, 2012 00:42:42 jerro wrote: After trying your solution I found out I was calling indexOf(string, char) which apparently is different than indexOf(string, string) as I now no longer have that error. Instead, when I call parse on compile time I get the following at the method parse: Error: URI class literals cannot be returned from CTFE It looks like you can't have class enums. This fails too: class A{} enum a = new A; I don't think you can get around this Nope. To some extent, classes can be used at compile time, but they can't persist from compile time to runtime. So, if you really want a type which is useable with CTFE, make it a struct, not a class. - Jonathan M Davis Then I shall make it a struct. But is the following acceptable in phobos? On Thursday, 8 November 2012 at 15:10:18 UTC, Mike van Dongen wrote: I agree with Jens Mueller on the fact that URI should be a struct instead of a class. But then I won't be able to return null anymore so I should throw an exception when an invalid URI has been passed to the constructor.
Re: Uri class and parser
On Friday, November 09, 2012 01:16:54 Mike van Dongen wrote: Then I shall make it a struct. But is the following acceptable in phobos? On Thursday, 8 November 2012 at 15:10:18 UTC, Mike van Dongen wrote: I agree with Jens Mueller on the fact that URI should be a struct instead of a class. But then I won't be able to return null anymore so I should throw an exception when an invalid URI has been passed to the constructor.s Sure. I see no problem with throwing an exception when a constructor is given invalid data. std.datetime does that with its types. - Jonathan M Davis
Re: Uri class and parser
Walter Bright wrote: On 10/26/2012 2:13 AM, Jonathan M Davis wrote: There's definitely some truth to that, but Walter in particular seems to be against breaking anything period. We have a number of fed up developers and abandoned, dead projects because of breaking changes. It MUST STOP. Can you point me to the data that you base you conclusions on? I wonder whether each breaking change can be considered equal. And what's the plan to improve Phobos then? Just have two modules? Jens
Re: Uri class and parser
On Sunday, October 28, 2012 11:32:14 Jens Mueller wrote: Walter Bright wrote: On 10/26/2012 2:13 AM, Jonathan M Davis wrote: There's definitely some truth to that, but Walter in particular seems to be against breaking anything period. We have a number of fed up developers and abandoned, dead projects because of breaking changes. It MUST STOP. Can you point me to the data that you base you conclusions on? I don't know everything that he's basing it on, but we've definitely had complaints in this newsgroup in the past over breaking changes (and it wouldn't surprise me if the folks in this newsgroup are more willing to put up with breaking changes than most D users). Even changes which are very popular can generate complaints about the code breakage that they cause. When the library is younger, it's not such a big deal, but as it matures, it becomes a very big deal. And at this point, we seem to be mature enough that we're moving away from the stage where it's okay to break stuff without a very good reason. Certainly, if we don't reach there soon, then it's going to be much harder for D to catch on. We'd like to get to the point that we don't break anything ever (or at least don't break anything without a major version change of some kind). I wonder whether each breaking change can be considered equal. I think that you can talk Walter into breaking changes if doing so is very beneficial (e.g. we're looking at removing opEquals, opCmp, toHash, and toString from Object because doing so would have huge benefits with regards to const-correctness among other things), but something as simple as renaming a function or module definitely isn't going to sit well with him, pretty much regardless of how much code uses it or how simple a change it is. And what's the plan to improve Phobos then? Just have two modules? In this case, if moving std.curl to std.net.curl isn't deemed worth the breakage that it would cause (and I'd be very suprised if Walter thought that it was), then we'd probably just keep std.curl and not move it at all. And while reorganizing some modules might be desirable, it really doesn't buy us much. I'd strongly argue that if it's not worth putting a module through the deprecation process in order to rename it, then it shouldn't be renamed at all. Having both std.curl and std.net.curl around permanently (even if one is basically an alias of the other), just seems like a bad idea to me. It creates clutter in the library. Best case, we could rename std.curl to std.net.curl, leaving std.curl empty save for the fact that it publicly imports std.net.curl, undocument std.curl completely, and then leave it around as undocumented, but that still creates clutter. It's just less visible. Aside from there being two modules as part of deprecating one and getting code to use the new one (which we really do want to stop doing), then I would only expect us to end up with two modules if we completely revamped a module but didn't want to break code using the old version (e.g. std.xml - std.xml2). - Jonathan m Davis
Re: Uri class and parser
On 2012-10-28 12:39, Jonathan M Davis wrote: I don't know everything that he's basing it on, but we've definitely had complaints in this newsgroup in the past over breaking changes (and it wouldn't surprise me if the folks in this newsgroup are more willing to put up with breaking changes than most D users). Even changes which are very popular can generate complaints about the code breakage that they cause. When the library is younger, it's not such a big deal, but as it matures, it becomes a very big deal. And at this point, we seem to be mature enough that we're moving away from the stage where it's okay to break stuff without a very good reason. Certainly, if we don't reach there soon, then it's going to be much harder for D to catch on. We'd like to get to the point that we don't break anything ever (or at least don't break anything without a major version change of some kind). I think we should be able to break code when there's a good reason for it. The RailsConf 2012 had a great talk called Progress. I think it's really worth seeing: http://www.youtube.com/watch?v=VOFTop3AMZ8 In this case, if moving std.curl to std.net.curl isn't deemed worth the breakage that it would cause (and I'd be very suprised if Walter thought that it was), then we'd probably just keep std.curl and not move it at all. And while reorganizing some modules might be desirable, it really doesn't buy us much. I'd strongly argue that if it's not worth putting a module through the deprecation process in order to rename it, then it shouldn't be renamed at all. Having both std.curl and std.net.curl around permanently (even if one is basically an alias of the other), just seems like a bad idea to me. It creates clutter in the library. Best case, we could rename std.curl to std.net.curl, leaving std.curl empty save for the fact that it publicly imports std.net.curl, undocument std.curl completely, and then leave it around as undocumented, but that still creates clutter. It's just less visible. That's way we need to stop this madness of putting all modules in the std package. BTW, this should have happened five years ago. Aside from there being two modules as part of deprecating one and getting code to use the new one (which we really do want to stop doing), then I would only expect us to end up with two modules if we completely revamped a module but didn't want to break code using the old version (e.g. std.xml - std.xml2). - Jonathan m Davis -- /Jacob Carlborg
Re: Uri class and parser
On 2012-10-26 16:16, Jens Mueller wrote: Probably. I'm all for it. The thing is that new things are proposed but these then die out quickly. I have that feeling e.g. with trello.com even though it's a good idea. I wish I knew why these things do not work out. But there are also decisions that turned out to be very successful, e.g. moving to github. It's all about conniving Walter. He's the one release the new versions. -- /Jacob Carlborg
Re: Uri class and parser
On 2012-10-26 22:20, Walter Bright wrote: n 10/23/2012 1:47 PM, Mike van Dongen wrote: My code can be found here, at the bottom of the already existing file uri.d: https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d Needs to have a range interface, not a string interface. For what, encoding/decoding? -- /Jacob Carlborg
Re: Uri class and parser
On 2012-10-25 23:06, Jens Mueller wrote: I'd prefer the second option. Maybe write first some unittests for std.uri, if there are none. Then move it. Agree, but we want to minimize the code breakage. -- /Jacob Carlborg
Re: Uri class and parser
Jacob Carlborg wrote: On 2012-10-25 23:06, Jens Mueller wrote: I'd prefer the second option. Maybe write first some unittests for std.uri, if there are none. Then move it. Agree, but we want to minimize the code breakage. That's what the unittests are for. Code breakage that results in compiler errors (i.e. using deprecate) are tolerable, I think. Silently code breakage is problematic. Jens
Re: Uri class and parser
On Friday, October 26, 2012 10:11:04 Jens Mueller wrote: Jacob Carlborg wrote: On 2012-10-25 23:06, Jens Mueller wrote: I'd prefer the second option. Maybe write first some unittests for std.uri, if there are none. Then move it. Agree, but we want to minimize the code breakage. That's what the unittests are for. Code breakage that results in compiler errors (i.e. using deprecate) are tolerable, I think. Silently code breakage is problematic. No. The issue is code breakage in the code of people using Phobos, and if you change where the module is, you'll break code. Even if we provide a deprecation path from std.uri to std.net.uri, that still means that people will have to change their code eventually, meaning that you still have code breakage (it's just better controlled). Making the change has to be worth breaking people's code, and making breaking changes to Phobos is becoming less and less acceptable. I don't know whether it is or isn't acceptable in this case. - Jonathan M Davis
Re: Uri class and parser
Jonathan M Davis wrote: On Friday, October 26, 2012 10:11:04 Jens Mueller wrote: Jacob Carlborg wrote: On 2012-10-25 23:06, Jens Mueller wrote: I'd prefer the second option. Maybe write first some unittests for std.uri, if there are none. Then move it. Agree, but we want to minimize the code breakage. That's what the unittests are for. Code breakage that results in compiler errors (i.e. using deprecate) are tolerable, I think. Silently code breakage is problematic. No. The issue is code breakage in the code of people using Phobos, and if you change where the module is, you'll break code. Even if we provide a deprecation path from std.uri to std.net.uri, that still means that people will have to change their code eventually, meaning that you still have code breakage (it's just better controlled). Making the change has to be worth breaking people's code, and making breaking changes to Phobos is becoming less and less acceptable. I don't know whether it is or isn't acceptable in this case. We should add the cost of fixing to the equation. When code is easy to fix I argue it should always be done. In case of std.uri any import of it needs to be fixed, i.e. the fixing cost is very low. I wish we had some estimate how much code is affected. Can't we release the web page stats or something similar? This way we may estimate how much a module is used. Crawling github is also an option. I don't want to be in a situation where we cannot change things because we have to assume that infinite code is broken. Or we provide an community approved way to have different incompatible versions of a module. Andrei suggested the std.mymodule2 approach at some point. It'll be nice if DConf discusses these organizational problems. They are of strategic importance I firmly believe. Jens
Re: Uri class and parser
On Friday, October 26, 2012 10:59:17 Jens Mueller wrote: No. The issue is code breakage in the code of people using Phobos, and if you change where the module is, you'll break code. Even if we provide a deprecation path from std.uri to std.net.uri, that still means that people will have to change their code eventually, meaning that you still have code breakage (it's just better controlled). Making the change has to be worth breaking people's code, and making breaking changes to Phobos is becoming less and less acceptable. I don't know whether it is or isn't acceptable in this case. We should add the cost of fixing to the equation. There's definitely some truth to that, but Walter in particular seems to be against breaking anything period. If it were entirely up to him, pretty much none of the breaking changes that have happened to Phobos' API over the last few years would have happened. And Andrei is beginning to oppose most breaking changes. So, the bar is getting pretty high for making breaking changes. Simply renaming stuff generally isn't going to cut it. This is arguably slightly more than simply renaming std.uri, because it's an issue of module organization rather than simply what its name is, but it's also arguably so trivial that the benefit is near zero. I doubt that std.uri is a particularly heavily used module, but I have no idea how acceptable Walter or Andrei would find it to move it to std.net. In principle, it would be good, but in practice, I don't know. Whether it can happen or not probably comes down primarily to whether you can convince Andrei or not. - Jonathan M Davis
Re: Uri class and parser
Jonathan M Davis wrote: On Friday, October 26, 2012 10:59:17 Jens Mueller wrote: No. The issue is code breakage in the code of people using Phobos, and if you change where the module is, you'll break code. Even if we provide a deprecation path from std.uri to std.net.uri, that still means that people will have to change their code eventually, meaning that you still have code breakage (it's just better controlled). Making the change has to be worth breaking people's code, and making breaking changes to Phobos is becoming less and less acceptable. I don't know whether it is or isn't acceptable in this case. We should add the cost of fixing to the equation. There's definitely some truth to that, but Walter in particular seems to be against breaking anything period. If it were entirely up to him, pretty much none of the breaking changes that have happened to Phobos' API over the last few years would have happened. And Andrei is beginning to oppose most breaking changes. So, the bar is getting pretty high for making breaking changes. Simply renaming stuff generally isn't going to cut it. This is arguably slightly more than simply renaming std.uri, because it's an issue of module organization rather than simply what its name is, but it's also arguably so trivial that the benefit is near zero. I doubt that std.uri is a particularly heavily used module, but I have no idea how acceptable Walter or Andrei would find it to move it to std.net. In principle, it would be good, but in practice, I don't know. Whether it can happen or not probably comes down primarily to whether you can convince Andrei or not. Is it okay to have both modules and only state in std.uri's documentation that you shouldn't use it anymore (similar to std.xml)? This would break no code. Jens
Re: Uri class and parser
On Friday, October 26, 2012 11:27:40 Jens Mueller wrote: Is it okay to have both modules and only state in std.uri's documentation that you shouldn't use it anymore (similar to std.xml)? This would break no code. If we were to move it, we'd temporarily leave a near-empty std.uri which publicly imported std.net.uri. At some later date, we'd then deprecate it, and then later actually remove it. There's a definite cost to leaving such a module around long term, because then you code floating around using both std.uri and std.net.uri. We might try something like removing it from the documentation instead of deprecating it (or at least delay that deprecation for a while), but then you'd have code silently using it without its developers knowing that they needed to change anything. It can certainly be done, and there are things that can be done to mitigate how quickly code is broken, but if you move it, code _will_ be broken at some point, and if you leave what basically amounts to an alias around to it long term, there's a definite cost to that as well. Personally, I wouldn't be against putting std.uri through the deprecation path and move it to std.net.uri, but I'm also almost certainly the person who has broken the most of Phobos' API in an effort to make it consistent (something which Walter has never been happy with). So, I'm not really in a position to approve such a change. At this point, I think that something like that pretty much has to go through Andrei. - Jonathan M Davis
Re: Uri class and parser
Jonathan M Davis wrote: On Friday, October 26, 2012 11:27:40 Jens Mueller wrote: Is it okay to have both modules and only state in std.uri's documentation that you shouldn't use it anymore (similar to std.xml)? This would break no code. If we were to move it, we'd temporarily leave a near-empty std.uri which publicly imported std.net.uri. At some later date, we'd then deprecate it, and then later actually remove it. There's a definite cost to leaving such a module around long term, because then you code floating around using both std.uri and std.net.uri. We might try something like removing it from the documentation instead of deprecating it (or at least delay that deprecation for a while), but then you'd have code silently using it without its developers knowing that they needed to change anything. It can certainly be done, and there are things that can be done to mitigate how quickly code is broken, but if you move it, code _will_ be broken at some point, and if you leave what basically amounts to an alias around to it long term, there's a definite cost to that as well. Personally, I wouldn't be against putting std.uri through the deprecation path and move it to std.net.uri, but I'm also almost certainly the person who has broken the most of Phobos' API in an effort to make it consistent (something which Walter has never been happy with). So, I'm not really in a position to approve such a change. At this point, I think that something like that pretty much has to go through Andrei. It's a difficult compromise that has to be found. That's why I also want more data to make a decision. But I'm actually more towards breaking code if it makes sense. Ideally we should be able to measure the impact that a breaking change will have. deprecate allows something like this. I.e. deprecate it and if you get lots of complaints rethink whether it is worth it. E.g. setAssertHandler used to be deprecated. Now it is undeprecated. Unfortunately there are no portable solutions. The only thing I found is symbol versioning. But this is only supported by GNU ld. These strategic issues need to be solved. Jens
Re: Uri class and parser
On 2012-10-26 11:53, Jens Mueller wrote: Unfortunately there are no portable solutions. The only thing I found is symbol versioning. But this is only supported by GNU ld. These strategic issues need to be solved. There are many other solutions that can be used to easy this problem. The release process can be improved significantly, this has been discussed before. For example, provide a road map. Then people will know that's coming. Better release scheme. Releases of DMD, D and Phobos shouldn't necessarily be in the same release. We use a semantic version scheme: http://semver.org/ This will make it easier for developers when they upgrade DMD. It basically let them know if something will break or not. -- /Jacob Carlborg
Re: Uri class and parser
On 2012-10-26 11:13, Jonathan M Davis wrote: There's definitely some truth to that, but Walter in particular seems to be against breaking anything period. If it were entirely up to him, pretty much none of the breaking changes that have happened to Phobos' API over the last few years would have happened. And Andrei is beginning to oppose most breaking changes. So, the bar is getting pretty high for making breaking changes. Simply renaming stuff generally isn't going to cut it. This is arguably slightly more than simply renaming std.uri, because it's an issue of module organization rather than simply what its name is, but it's also arguably so trivial that the benefit is near zero. I agree that's not good to break code. But I also don't like that we have to suffer for the choices/mistakes made in the past just because of risking breaking code. Example, someone thought it was a great idea to basically not not use the nice module system we do have in D and instead use a flat namespace. -- /Jacob Carlborg
Re: Uri class and parser
Looks good. Does it handle relative URIs? It would also be nice to support combining URIs from an absolute and relative portion. Another omission is handling file URIs.
Re: Uri class and parser
On Friday, 26 October 2012 at 11:52:09 UTC, John Chapman wrote: Looks good. Does it handle relative URIs? It would also be nice to support combining URIs from an absolute and relative portion. Another omission is handling file URIs. It doesn't support relative URIs as their syntax is not officially defined in RFC 3986, nor are there any relative URIs in the examples: http://tools.ietf.org/html/rfc3986#section-1.1.2 I am however considering it because even though not (clearly) defined, they are URIs and they are often used. Definition of URI on Wikipedia: In computing, a uniform resource identifier (URI) is a string of characters used to identify a name or a resource.
Re: Uri class and parser
Jacob Carlborg wrote: On 2012-10-26 11:53, Jens Mueller wrote: Unfortunately there are no portable solutions. The only thing I found is symbol versioning. But this is only supported by GNU ld. These strategic issues need to be solved. There are many other solutions that can be used to easy this problem. The release process can be improved significantly, this has been discussed before. For example, provide a road map. Then people will know that's coming. Better release scheme. Releases of DMD, D and Phobos shouldn't necessarily be in the same release. We use a semantic version scheme: http://semver.org/ This will make it easier for developers when they upgrade DMD. It basically let them know if something will break or not. Probably. I'm all for it. The thing is that new things are proposed but these then die out quickly. I have that feeling e.g. with trello.com even though it's a good idea. I wish I knew why these things do not work out. But there are also decisions that turned out to be very successful, e.g. moving to github. Jens
Re: Uri class and parser
On Friday, 26 October 2012 at 14:13:21 UTC, Mike van Dongen wrote: I am however considering it because even though not (clearly) defined, they are URIs and they are often used. The basedOn function in my uri struct in cgi.d does it: https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/cgi.d#L1702 The unit test below the function is focused on checking the relative function.
Re: Uri class and parser
On Friday, 26 October 2012 at 14:22:07 UTC, Adam D. Ruppe wrote: On Friday, 26 October 2012 at 14:13:21 UTC, Mike van Dongen wrote: I am however considering it because even though not (clearly) defined, they are URIs and they are often used. The basedOn function in my uri struct in cgi.d does it: https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/cgi.d#L1702 The unit test below the function is focused on checking the relative function. On Friday, 26 October 2012 at 11:52:09 UTC, John Chapman wrote: It would also be nice to support combining URIs from an absolute and relative portion. If relative URIs will be supported I agree there should be a method that merges two (incomplete) URIs into one (valid) URI.
Re: Uri class and parser
n 10/23/2012 1:47 PM, Mike van Dongen wrote: My code can be found here, at the bottom of the already existing file uri.d: https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d Needs to have a range interface, not a string interface.
Re: Uri class and parser
On 10/26/2012 2:13 AM, Jonathan M Davis wrote: There's definitely some truth to that, but Walter in particular seems to be against breaking anything period. We have a number of fed up developers and abandoned, dead projects because of breaking changes. It MUST STOP.
Re: Uri class and parser
On Friday, 26 October 2012 at 20:20:59 UTC, Walter Bright wrote: Needs to have a range interface, not a string interface. I don't think range makes any sense for this. It isn't something you'd loop over... you just load the one string and slice it up.
Re: Uri class and parser
On 2012-10-24 22:36, Adam D. Ruppe wrote: Yes, that's the term in the standard. http://en.wikipedia.org/wiki/Fragment_identifier Javascript calls it the hash though, but it is slightly different: the # symbol itself is not part of the fragment according to the standard. But javascript's location.hash does return it. URL: example.com/ location.hash location.hash = test test URL changes to: example.com/#test location.hash; #test The fragment would technically just be test there. I've obviously done too much JavaScript :). Thanks for the clarification. -- /Jacob Carlborg
Re: Uri class and parser
On Wednesday, 24 October 2012 at 20:36:51 UTC, Adam D. Ruppe wrote: On Wednesday, 24 October 2012 at 19:54:54 UTC, Jacob Carlborg wrote: A nitpick, I'm not really an expert on URI's but is fragment really the correct name for that I would call the hash? That would be nose in the example below. Yes, that's the term in the standard. http://en.wikipedia.org/wiki/Fragment_identifier The only reason I used fragment was because both the RFC and the Wikipedia page called it that way. I hate to break protocol ;) Cool. It would be nice to have a way to set the query and path as an (associative) array as well. Now it allows you to create/edit an URI. You can do so by using an array or string, whichever you prefer. I also added a toString() method and fixed the indentation to 4 spaces, instead of 1 tab. uri = new Uri(); uri.scheme = foo; uri.username = username; uri.password = password; uri.host = example.com; uri.port = 8042; uri.path = [over, there, index.dtb]; uri.query = [type: animal, name: narwhal, novalue: ]; uri.fragment = nose; assert(uri.toString() == foo://username:passw...@example.com:8042/over/there/index.dtb?novalue=name=narwhaltype=animal#nose);
Re: Uri class and parser
On 2012-10-25 13:24, Mike van Dongen wrote: Now it allows you to create/edit an URI. You can do so by using an array or string, whichever you prefer. I also added a toString() method and fixed the indentation to 4 spaces, instead of 1 tab. uri = new Uri(); uri.scheme = foo; uri.username = username; uri.password = password; uri.host = example.com; uri.port = 8042; uri.path = [over, there, index.dtb]; uri.query = [type: animal, name: narwhal, novalue: ]; uri.fragment = nose; assert(uri.toString() == foo://username:passw...@example.com:8042/over/there/index.dtb?novalue=name=narwhaltype=animal#nose); Awesome, now it's only missing documentation :) -- /Jacob Carlborg
Re: Uri class and parser
On Wednesday, 24 October 2012 at 12:47:15 UTC, Adam D. Ruppe wrote: BTW don't forget that this is legal: ?valuevalue=1value=2 The appropriate type for the AA is string[][string] Thanks for the type! It's now implemented. uri = Uri.parse(http://dlang.org/?valuevalue=1value=2;); assert(uri.queryMulti == [value: [, 1, 2]]); assert(uri.query[value] == 2); On Thursday, 25 October 2012 at 11:44:11 UTC, Jacob Carlborg wrote: Awesome, now it's only missing documentation :) It's a dirty job, but someone's gotta do it ;) I have commented all code that's not straightforward, but nothing for the Ddoc. Can you give me an example of how specific I need to be in the documentation? On Wednesday, 24 October 2012 at 19:54:54 UTC, Jacob Carlborg wrote: On 2012-10-24 20:22, Mike van Dongen wrote: As all my code is part of a single class and the file std/uri.d already existed, I decided to 'just' append my code to the file. Should I perhaps put it in another file as the private methods you mentioned are not relevant to my code? If the some methods aren't used by the URI parser you should remove the. If they're used I would suggested you move the further down in the code, possibly at the bottom. It's not clear to me what you mean by this. To clarify: the first 520 lines weren't written by me, and the code I have written doesn't use any of those functions. Atleast, for now; Moving the functions 'encode' and 'decode' into the class Uri may be useful at a later point. As I'm the new kid on the block, I'm trying not to break others' code. ;)
Re: Uri class and parser
Mike van Dongen wrote: Hi all! I've been working on an URI parser which takes a string and then separates the parts and puts them in the correct properties. If a valid URI was provided, the (static) parser will return an instance of Uri. I've commented all relevant lines of code and tested it using unittests. Now what I'm wondering is if it meets the phobos requirements and standards. And of course if you think I should do a pull request on GitHub! My code can be found here, at the bottom of the already existing file uri.d: https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d Just some small nit-picks. * Rename URIerror to URIException and make it derive from Exception I assume it is allowed to recover from such errors. * URI_Encode = uriEncode in general checkout the Phobos style guide regarding function names etc. http://dlang.org/dstyle.html * Why is Uri a class and not a struct? Do you expect people to derive from Uri? But then you need to check whether URI.init makes sense. According to the Phobos style Uri should be renamed to URI. * Maybe you should add a constructor in addition to URI.parse() to construct some simple URIs. * Check whether it is const correct. Is const used where it makes sense? * You could implement opEquals to allow checking for equal URIs. * Should URIs be only movable? Then you should add @disable this(this); * The code looks sometimes a bit C-ish (gotos) and deeply nested. Maybe some functions could be refactored to ease maintenance. * Regarding documentation. Please state whether it works in CTFE. Probably add some unittests. I hope the module makes it into Phobos. I suggest std.net.uri. Thank you very much for contributing. Jens
Re: Uri class and parser
On Thursday, October 25, 2012 15:59:39 Jens Mueller wrote: * Should URIs be only movable? Then you should add @disable this(this); I'd be shocked if that were appropriate. Disabling the init property like that should be done extremely rarely and only if you need to. It's problematic that that feature exists at all. * Regarding documentation. Please state whether it works in CTFE. Probably add some unittests. If it states that it works in CTFE, it needs unit tests to verify it. I hope the module makes it into Phobos. I suggest std.net.uri. Thank you very much for contributing. We have std.uri already, so presumably, it would go there. - Jonathan M Davis
Re: Uri class and parser
On Thursday, 25 October 2012 at 13:59:59 UTC, Jens Mueller wrote: Just some small nit-picks. That's what I was hoping for :D in general checkout the Phobos style guide regarding function names etc. http://dlang.org/dstyle.html add some unittests. Thanks! I haven't read that page before, but I think my code is already in that style. That page referred to one about code coverage, which I didn't know is an option in dmd. I added some unittests and increased the code coverage to 100%. http://dlang.org/code_coverage.html * Rename URIerror to URIException and make it derive from Exception I assume it is allowed to recover from such errors. * URI_Encode = uriEncode I know it was hard to tell, but that's not my code. * The code looks sometimes a bit C-ish (gotos) and deeply nested. Maybe some functions could be refactored to ease maintenance. I'm not sure what functions you mean. I hope the module makes it into Phobos. I suggest std.net.uri. I agree I should move it, so moved my code into a new file, 'std/net/uri.d'. https://github.com/MikevanDongen/phobos/blob/uri-parser/std/net/uri.d * Why is Uri a class and not a struct? Do you expect people to derive from Uri? But then you need to check whether URI.init makes sense. According to the Phobos style Uri should be renamed to URI. The renaming is done. At the start, the class was quite general in a way I wanted to make others derive from it. That plus the little experience I have with structs in D made me use a class. Now I don't see a reason why anyone would derive from it, so I think it should either be a final class (as Jacob Carlborg suggested) or a struct. * Maybe you should add a constructor in addition to URI.parse() to construct some simple URIs. Do you mean with parameters (strings?) that simply set the properties? * Check whether it is const correct. Is const used where it makes sense? Been thinking about that myself aswell. Both for return values and parameters. * You could implement opEquals to allow checking for equal URIs. I like this suggestion! :) * Should URIs be only movable? Then you should add @disable this(this); http://forum.dlang.org/thread/mohceehplxdhsdllx...@forum.dlang.org I'm not sure what movable means or what @disable does. That thread doesn't explain it means very clear. * Regarding documentation. Please state whether it works in CTFE. Probably I'll look that up ;) I hope the module makes it into Phobos. I suggest std.net.uri. Thank you very much for contributing. Jens Again, thanks for being supportive; that's what makes me continue ;)
Re: Uri class and parser
On 2012-10-25 15:20, Mike van Dongen wrote: It's a dirty job, but someone's gotta do it ;) I have commented all code that's not straightforward, but nothing for the Ddoc. If think the ddoc is the most important one. Can you give me an example of how specific I need to be in the documentation? I would say as specific as possible. But the interface is pretty straight forward. It might be enough to have empty ddoc comments for all the methods that is a part of the URI, i.e. username, port and so on. Then add documentation for the class that shows how to use the complete interface. Take a look the $.mobile.path.parseUrl() method part of jquerymobile: http://jquerymobile.com/demos/1.2.0/docs/api/methods.html It shows an example in the bottom what every method would return. It's not clear to me what you mean by this. To clarify: the first 520 lines weren't written by me, and the code I have written doesn't use any of those functions. Atleast, for now; Moving the functions 'encode' and 'decode' into the class Uri may be useful at a later point. As I'm the new kid on the block, I'm trying not to break others' code. ;) Oh, Phobos already has a uri module, then you should leave the other code. My bad hehe. I noticed just now that you moved your new code to std.net.uri. I don't think it's good to have two uri modules. Either leave the code in std.uri or move all code to std.net.uri and add a public import or similar to std.uri. But that is still risk of breaking existing code. -- /Jacob Carlborg
Re: Uri class and parser
Jonathan M Davis wrote: On Thursday, October 25, 2012 15:59:39 Jens Mueller wrote: * Should URIs be only movable? Then you should add @disable this(this); I'd be shocked if that were appropriate. Disabling the init property like that should be done extremely rarely and only if you need to. It's problematic that that feature exists at all. That does not disable the init property. It does disable copying. * Regarding documentation. Please state whether it works in CTFE. Probably add some unittests. If it states that it works in CTFE, it needs unit tests to verify it. Right. Safer this way. I hope the module makes it into Phobos. I suggest std.net.uri. Thank you very much for contributing. We have std.uri already, so presumably, it would go there. I think it belongs inside std.net. Why not make it compatible to the current std.uri and add deprecated aliases? The std package is way too flat. Jens
Re: Uri class and parser
Jacob Carlborg wrote: On 2012-10-25 15:20, Mike van Dongen wrote: It's a dirty job, but someone's gotta do it ;) I have commented all code that's not straightforward, but nothing for the Ddoc. If think the ddoc is the most important one. Can you give me an example of how specific I need to be in the documentation? I would say as specific as possible. But the interface is pretty straight forward. It might be enough to have empty ddoc comments for all the methods that is a part of the URI, i.e. username, port and so on. Then add documentation for the class that shows how to use the complete interface. Take a look the $.mobile.path.parseUrl() method part of jquerymobile: http://jquerymobile.com/demos/1.2.0/docs/api/methods.html It shows an example in the bottom what every method would return. It's not clear to me what you mean by this. To clarify: the first 520 lines weren't written by me, and the code I have written doesn't use any of those functions. Atleast, for now; Moving the functions 'encode' and 'decode' into the class Uri may be useful at a later point. As I'm the new kid on the block, I'm trying not to break others' code. ;) Oh, Phobos already has a uri module, then you should leave the other code. My bad hehe. I noticed just now that you moved your new code to std.net.uri. I don't think it's good to have two uri modules. Either leave the code in std.uri or move all code to std.net.uri and add a public import or similar to std.uri. But that is still risk of breaking existing code. I'd prefer the second option. Maybe write first some unittests for std.uri, if there are none. Then move it. Jens
Re: Uri class and parser
Mike van Dongen wrote: On Thursday, 25 October 2012 at 13:59:59 UTC, Jens Mueller wrote: Just some small nit-picks. That's what I was hoping for :D in general checkout the Phobos style guide regarding function names etc. http://dlang.org/dstyle.html add some unittests. Thanks! I haven't read that page before, but I think my code is already in that style. That page referred to one about code coverage, which I didn't know is an option in dmd. I added some unittests and increased the code coverage to 100%. http://dlang.org/code_coverage.html Are you sure? I've seen _ in names where camel case should have been used. * Rename URIerror to URIException and make it derive from Exception I assume it is allowed to recover from such errors. * URI_Encode = uriEncode I know it was hard to tell, but that's not my code. You can still change it, can't you? Just want to see the code being improved. * The code looks sometimes a bit C-ish (gotos) and deeply nested. Maybe some functions could be refactored to ease maintenance. I'm not sure what functions you mean. There are functions that I find difficult to understand. Usually the ones with lots of deeply nested control statements. And often they are a bit long. I will read it later again and be more specific. I hope the module makes it into Phobos. I suggest std.net.uri. I agree I should move it, so moved my code into a new file, 'std/net/uri.d'. https://github.com/MikevanDongen/phobos/blob/uri-parser/std/net/uri.d Now we need to properly deprecate the old std.uri without breaking any code. Do you think this is feasible? And are the Phobos core developers okay with this? * Why is Uri a class and not a struct? Do you expect people to derive from Uri? But then you need to check whether URI.init makes sense. According to the Phobos style Uri should be renamed to URI. The renaming is done. At the start, the class was quite general in a way I wanted to make others derive from it. That plus the little experience I have with structs in D made me use a class. Now I don't see a reason why anyone would derive from it, so I think it should either be a final class (as Jacob Carlborg suggested) or a struct. Do you want reference semantics or value semantics? I find value semantics easier. So I would go with a struct. When passing URIs around one can still pass by reference. * Maybe you should add a constructor in addition to URI.parse() to construct some simple URIs. Do you mean with parameters (strings?) that simply set the properties? Yes. At least to support constructing often used URIs. And then maybe another constructor that calls parse for all other cases. * Check whether it is const correct. Is const used where it makes sense? Been thinking about that myself aswell. Both for return values and parameters. It's annoying to do it later but it should be done, I think. And it's not too late. Good luck! * You could implement opEquals to allow checking for equal URIs. I like this suggestion! :) * Should URIs be only movable? Then you should add @disable this(this); http://forum.dlang.org/thread/mohceehplxdhsdllx...@forum.dlang.org I'm not sure what movable means or what @disable does. That thread doesn't explain it means very clear. In my understanding structs are moved by default. That means the members a bit-wise copied. Now assuming you move a pointer, then you may want to copy the contents that the pointer to have an independent object, a copy so to say. This is achieved by using the so called postblit constructor this(this). If you disable it, then the URIs are only movable. If you want copying, then you may need to implement it. http://dlang.org/struct.html#StructPostblit * Regarding documentation. Please state whether it works in CTFE. Probably I'll look that up ;) I hope the module makes it into Phobos. I suggest std.net.uri. Thank you very much for contributing. Jens Again, thanks for being supportive; that's what makes me continue ;) I hope there will be a happy end. We should probably seek some advice from the core Phobos developers whether they want to have an improved std.uri moved to std.net.uri. Don't want you to put work into something that is unlikely to make through the review process. Jens
Re: Uri class and parser
On Thursday, October 25, 2012 23:05:14 Jens Mueller wrote: Jonathan M Davis wrote: On Thursday, October 25, 2012 15:59:39 Jens Mueller wrote: * Should URIs be only movable? Then you should add @disable this(this); I'd be shocked if that were appropriate. Disabling the init property like that should be done extremely rarely and only if you need to. It's problematic that that feature exists at all. That does not disable the init property. It does disable copying. Ah true. I read it too quickly. Regardless, disabling much of _anything_ is something that you shouldn't be doing normally (though disabling init is definitely the worst). It's far, far better to implement the postblit when a shallow copy doesn't do what you want. - Jonathan M Davis
Re: Uri class and parser
Jonathan M Davis wrote: On Thursday, October 25, 2012 23:05:14 Jens Mueller wrote: Jonathan M Davis wrote: On Thursday, October 25, 2012 15:59:39 Jens Mueller wrote: * Should URIs be only movable? Then you should add @disable this(this); I'd be shocked if that were appropriate. Disabling the init property like that should be done extremely rarely and only if you need to. It's problematic that that feature exists at all. That does not disable the init property. It does disable copying. Ah true. I read it too quickly. Regardless, disabling much of _anything_ is something that you shouldn't be doing normally (though disabling init is definitely the worst). It's far, far better to implement the postblit when a shallow copy doesn't do what you want. Yeah. And probably here you want to implement it. Jens
Re: Uri class and parser
On 2012-10-23 22:47, Mike van Dongen wrote: Hi all! I've been working on an URI parser which takes a string and then separates the parts and puts them in the correct properties. If a valid URI was provided, the (static) parser will return an instance of Uri. I've commented all relevant lines of code and tested it using unittests. Now what I'm wondering is if it meets the phobos requirements and standards. And of course if you think I should do a pull request on GitHub! My code can be found here, at the bottom of the already existing file uri.d: https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d Thanks, Mike van Dongen. I would have expected a few additional components, like: * Domain * Password * Username * Host * Hash A way to build an URI base on the components. It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array. A few stylistic issues. There are a lot of places where you haven't indented the code, at least how it looks like on github. I wouldn't put the private methods at the top. -- /Jacob Carlborg
Re: Uri class and parser
On Tuesday, 23 October 2012 at 20:47:26 UTC, Mike van Dongen wrote: https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d If you want to take any of the code from mine, feel free. It is struct Uri in my cgi.d: https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/cgi.d#L1615 My thing includes relative linking and some more parsing too. The ctRegex in there however, when it works it is cool, but if there's an error in an *other* part of the code, other module, doesn't call it, completely unrelated such as just making a typo on a local variable name... the compiler spews like 20 errors about ctRegex. That's annoying. But the bug is in the compiler and only makes other errors uglier so I'm just ignoring it for now.
Re: Uri class and parser
On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote: It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array. BTW don't forget that this is legal: ?valuevalue=1value=2 The appropriate type for the AA is string[][string] This is why my cgi.d has functions two decodeVariables and decodeVariablesSingle and two members (in the Cgi class, I didn't add it to the Uri struct) get and getArray. decodeVariables returns the complete string[][string] and the single versions only keep the last element of the string[], which gives a string[string] for convenience.
Re: Uri class and parser
On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote: I would have expected a few additional components, like: * Domain * Password * Username * Host * Hash A way to build an URI base on the components. It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array. I have a public domain URI parser here: http://github.com/p0nce/gfm/blob/master/common/uri.d
Re: Uri class and parser
On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote: I would have expected a few additional components, like: * Domain * Password * Username * Host * Hash A way to build an URI base on the components. It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array. Thanks for the suggestions! I've added many, if not all, of them to the repo: - Identifying/separating the username, password (together the userinfo), the domain and the port number from the authority. - The hash now also can be get/set and the same thing goes for the data in the query On Wednesday, 24 October 2012 at 12:47:15 UTC, Adam D. Ruppe wrote: On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote: It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array. BTW don't forget that this is legal: ?valuevalue=1value=2 The appropriate type for the AA is string[][string] It does not yet take into account the fact that multiple query elements can have the same name. I'll be working on that next. On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote: A few stylistic issues. There are a lot of places where you haven't indented the code, at least how it looks like on github. I wouldn't put the private methods at the top. As for the indentations, I use tabs with the size of 4 spaces. Viewing the code on Github (in Chromium) you'll see tabs of 8 spaces. I'm not sure what the phobos standard is? As all my code is part of a single class and the file std/uri.d already existed, I decided to 'just' append my code to the file. Should I perhaps put it in another file as the private methods you mentioned are not relevant to my code? You may be able to see the new getters by checking out this unittest: uri = Uri.parse(foo://username:passw...@example.com:8042/over/there/index.dtb?type=animalname=narwhalnovalue#nose); assert(uri.scheme == foo); assert(uri.authority == username:passw...@example.com:8042); assert(uri.path == over/there/index.dtb); assert(uri.pathAsArray == [over, there, index.dtb]); assert(uri.query == type=animalname=narwhalnovalue); assert(uri.queryAsArray == [type: animal, name: narwhal, novalue: ]); assert(uri.fragment == nose); assert(uri.host == example.com); assert(uri.port == 8042); assert(uri.username == username); assert(uri.password == password); assert(uri.userinfo == username:password); assert(uri.queryAsArray[type] == animal); assert(uri.queryAsArray[novalue] == ); assert(novalue in uri.queryAsArray); assert(!(nothere in uri.queryAsArray));
Re: Uri class and parser
On 2012-10-24 20:22, Mike van Dongen wrote: Thanks for the suggestions! I've added many, if not all, of them to the repo: - Identifying/separating the username, password (together the userinfo), the domain and the port number from the authority. - The hash now also can be get/set and the same thing goes for the data in the query As for the indentations, I use tabs with the size of 4 spaces. Viewing the code on Github (in Chromium) you'll see tabs of 8 spaces. I'm not sure what the phobos standard is? Ok, I'm using firefox and it doesn't look particular good on github. The Phobos standard is to use tabs as spaces with the size of 4. As all my code is part of a single class and the file std/uri.d already existed, I decided to 'just' append my code to the file. Should I perhaps put it in another file as the private methods you mentioned are not relevant to my code? If the some methods aren't used by the URI parser you should remove the. If they're used I would suggested you move the further down in the code, possibly at the bottom. You may be able to see the new getters by checking out this unittest: Cool. It would be nice to have a way to set the query and path as an (associative) array as well. Just a suggestion, I don't really see a point in having getters and setters that just forwards to the instance variables. Just use public instance variables. The only reason to use getters and setters would be to be able to subclass and override them. But I think you could just make Uri a final class. About path and query. I wonder that's best to be default return an (associative) array or a string. I would think it's more useful to return an (associative) array and then provide rawPath() and rawQuery() which would return strings. A nitpick, I'm not really an expert on URI's but is fragment really the correct name for that I would call the hash? That would be nose in the example below. uri = Uri.parse(foo://username:passw...@example.com:8042/over/there/index.dtb?type=animalname=narwhalnovalue#nose); assert(uri.scheme == foo); assert(uri.authority == username:passw...@example.com:8042); assert(uri.path == over/there/index.dtb); assert(uri.pathAsArray == [over, there, index.dtb]); assert(uri.query == type=animalname=narwhalnovalue); assert(uri.queryAsArray == [type: animal, name: narwhal, novalue: ]); assert(uri.fragment == nose); assert(uri.host == example.com); assert(uri.port == 8042); assert(uri.username == username); assert(uri.password == password); assert(uri.userinfo == username:password); assert(uri.queryAsArray[type] == animal); assert(uri.queryAsArray[novalue] == ); assert(novalue in uri.queryAsArray); assert(!(nothere in uri.queryAsArray)); -- /Jacob Carlborg
Re: Uri class and parser
On Wednesday, 24 October 2012 at 19:54:54 UTC, Jacob Carlborg wrote: A nitpick, I'm not really an expert on URI's but is fragment really the correct name for that I would call the hash? That would be nose in the example below. Yes, that's the term in the standard. http://en.wikipedia.org/wiki/Fragment_identifier Javascript calls it the hash though, but it is slightly different: the # symbol itself is not part of the fragment according to the standard. But javascript's location.hash does return it. URL: example.com/ location.hash location.hash = test test URL changes to: example.com/#test location.hash; #test The fragment would technically just be test there.