Re: [racket-dev] consistency in names and signatures
Along the consistency-in-function-naming vein: file-name-from-path versus filename-extension. Is "filename" one word or two? I prefer one.

Even more tangential: why isn't file-name-from-path called path-filename instead? Or even basename?

On Thu, Mar 29, 2012 at 08:07, David T. Pierson d...@mindstory.com wrote:
> On Thu, Mar 29, 2012 at 12:44:35AM -0400, David T. Pierson wrote:
> > (Presumably if equally concise names that better reflected function
> > signatures were available, they would have been used in the first
> > place.)
>
> Sorry for the double post.
>
> I should have added "equally lucid" along with "equally concise".
>
> Perhaps what I should have asked was simply whether there exist names
> that better indicate function signatures but are still good in all or
> most other important aspects, and whether it is worth breaking
> compatibility for such.
>
> David

Racket Developers list: http://lists.racket-lang.org/dev
Re: [racket-dev] consistency in names and signatures
Putting aside the 8 (yeah, really) ways to report errors in Haskell, this is the option provided by the Maybe type (data Maybe a = Just a | Nothing). While I see many benefits to this approach, I think contracts may provide a new way out.

In most typed languages the major constraint seems to be that types are checked at compile time and are limited in complexity. Contracts, however, have practically unlimited power. The trick is that one would have to incorporate some way to reify the calculation checked by the contract. If you could get hold of the value produced by contract checking, then no duplicate computation would have to occur inside the function. This is especially relevant for functions like string->number, because the most obvious implementation checks validity during parsing -- checking validity and parsing basically duplicate the function.

Also, if one sees a sort of inner function (post-contract-check) and an outer function (including the contract check), then one could even term the inner function total and the outer function partial.

The advantages of this approach, as far as I can see, are that it removes large amounts of failure checking and that it encourages large amounts of precondition code to move into the contract. Not only is this good for documentation and interfaces, but it also helps with my current project (random testing through contracts).

2c

Cheers,
Andy

On Tue, Mar 27, 2012 at 5:23 PM, Matthias Felleisen matth...@ccs.neu.edu wrote:
> [...]
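The duplicated work described above (a contract that can only validate its input by running the parser, followed by a body that runs the parser again) can be made visible with a counter. This is a minimal sketch, not code from the Racket libraries: `parse`, `numeric-string?`, `checked-parse`, and `parse-count` are hypothetical names, and string->number merely stands in for a more expensive parser.

```racket
#lang racket/base
;; Illustrative sketch only. `parse`, `numeric-string?`, `checked-parse`,
;; and `parse-count` are hypothetical names, not part of any Racket library.
(require racket/contract)

;; Counts how many times the parser runs.
(define parse-count 0)

;; The "real work": here just string->number, which yields #f on failure.
(define (parse s)
  (set! parse-count (add1 parse-count))
  (string->number s))

;; A precondition that can only be decided by running the parser.
(define (numeric-string? v)
  (and (string? v) (parse v) #t))

;; The contract parses to validate; the body parses again to get the value.
(define/contract (checked-parse s)
  (-> numeric-string? number?)
  (parse s))

(define result (checked-parse "42"))
```

After the single call to `checked-parse`, `parse-count` is at least 2: the parser ran during contract checking and again in the body. Avoiding that second run is what exposing the value computed by the contract would buy.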
Re: [racket-dev] consistency in names and signatures
Yesterday, Andy Gocke wrote:
> [...] This is especially relevant for functions like string->number
> because the most obvious implementation checks validity during
> parsing -- checking the validity and parsing basically duplicate the
> function.

And that makes most of my point. The thing is that `string->url' is basically *just* a parser -- it does very little after matching the regexp. I therefore view the commit as adding a contract to, say, `read-xml', where the contract runs the function to see that the input is valid.

An even more extreme example would be `get-pure-port': if you really want a complete specification of the domain in a contract, then the contract should make sure that the server is reachable, and that it returns a valid page. Combine this with parsing the page, and it should be clear that this is not really a great way to run code (ATM!). Besides the issue of doing a bunch of work twice, the contract would still be broken, since having a valid server and/or a page now doesn't mean that it's going to be valid on the next attempt.

To make this practical, you'd need some way to expose values that are computed as part of the contract. ("Reify" feels wrong to me in this context...) That's why I added the "ATM" above. There is an obvious appeal in doing this -- having all error handling in specific pieces of code and floating them upwards sounds tempting *if* there's some way to do it right. I suspect that such an exposure of the contract results is just one small step in getting this. I'm also not sure that it's doable in a way that actually leads to a practical benefit. This is similar to my doubting the theoretical utility of running a parser twice: on one level you get your guaranteed, nicely total function, but on the level of providing that guarantee, you get the original problem. (And in terms that I'm used to, this is switching the same work to your well-formedness goal, and that buys nothing in terms of getting things done.)

IMO, this problem is fundamental enough that it shows up in many contexts. One of them is already visible in the `string->url' example. The new documentation reads:

| url-regexp : regexp?
|
| This is a regular expression based on the one in Appendix B of
| RFC 3986 for recognizing urls. This is the precise regexp:
|
| ^(?:([^:/?#]*):)?(?://(?:([^/?#@]*)@)?([^/?#:]*)?(?::([0-9]*))?)
| ?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$

(Pre-disclaimer: the following is not said in a negative way.)

At least in my view, this documentation is useless. It's true that it's precise, but as a user of this code, I get nothing out of it. I can't even *use* that regexp (the one quoted in the docs) since it looks like something that can easily change, so I had better use the `url-regexp' binding and not the quoted regexp. But the deeper reason that this is not useful to me is that it essentially spells out the parser code -- and documenting a function using its own code is (IMO) often a sign that the abstraction is questionable.

But there are a few additional problems with this change that I see:

* Beyond quoting it in the documentation, exposing the regexp means that it becomes part of the interface. This means that I now cannot re-implement the code in any way other than matching a regexp.

* It is still partial. For example, this

    -> (string->url "1:/")
    ; Invalid URL string; bad scheme "1": "1:/"

  is still not a contract error. (And I can't see an obvious way to add it to the regexp, maybe with some lookahead tricks.) Another example is the host part, which is not even checked, but this is just sloppiness (= deferring it to network errors that will happen with malformed hosts). And BTW, doing that means that the contract becomes platform dependent:

    -> (file-url-path-convention-type 'unix)
    -> (url-host (string->url "file://x:x/baz"))
    "x"
    -> (file-url-path-convention-type 'windows)
    -> (url-host (string->url "file://x:x/baz"))
    ""

* More importantly, and possibly related to the first bullet, it stands in the way of improving this code. There is a major problem in the design of the code -- it parses all urls as `http'. A proper way to deal with it is to choose a specific parser based on the scheme. For example, as it looks now, I can't change it to properly treat "mailto:..." urls. That's not theoretical -- I planned on doing that extension, and now it is impossible to do it in a nice way.

--
((lambda (x) (x x)) (lambda (x) (x x)))  Eli Barzilay: http://barzilay.org/  Maze is Life!
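The last bullet's suggestion, choosing a parser based on the url's scheme, might look roughly like the following. This is a sketch under stated assumptions and not how net/url is implemented: `parse-mailto`, `parse-generic`, `scheme-parsers`, and `parse-url/by-scheme` are hypothetical names, results are tagged lists rather than url structs, and failure is reported as #f purely for brevity.

```racket
#lang racket/base
;; Illustrative sketch only: these names are hypothetical and are not
;; part of net/url. Results are tagged lists rather than url structs.

;; Parser for the part after "mailto:".
(define (parse-mailto tail)
  (list 'mailto tail))

;; Fallback for schemes without a dedicated parser.
(define (parse-generic scheme tail)
  (list 'generic scheme tail))

;; Table from lowercase scheme name to its parser.
(define scheme-parsers
  (hash "mailto" parse-mailto))

;; Split off the scheme, then hand the remainder to that scheme's parser.
;; Returns #f when the string has no well-formed scheme.
(define (parse-url/by-scheme str)
  (define m (regexp-match #rx"^([a-zA-Z][a-zA-Z0-9+.-]*):(.*)$" str))
  (and m
       (let* ([scheme (string-downcase (cadr m))]
              [tail (caddr m)]
              [parser (hash-ref scheme-parsers scheme #f)])
         (if parser
             (parser tail)
             (parse-generic scheme tail)))))
```

With a table like this, supporting another scheme means adding one entry, and no single regexp has to be exposed as part of the interface. Under this sketch, `(parse-url/by-scheme "1:/")` yields #f because a scheme must start with a letter.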
[racket-dev] consistency in names and signatures
Bug report 12652 reminded me of a topic that I brought up a while back, that I tried to incorporate into the Style Guide, and that I forgot to re-introduce here.

Background: a lot of people think that consistency in naming, signature/contract, and functionality (for methods and functions) is a key element of successful software projects. If you saw Yaron Minsky's talk at POPL, or if you are in a department where he delivered his "OCaml is great for trading" talk, you know what I mean. He formulates this point well, and he gives good examples.

Topic: In our world, we have functions such as

  string->number
  string->path
  string->url
  bytes->string/utf-8

The naming consistency is good, but they aren't really consistent at the signature or functionality level:

  string->number produces #f when called on "hello world" or "\0"
  string->path fails on "\0"
  string->url succeeds on "\0" and produces a url

I consider this less than desirable.

I understand arguments for #f and exceptional behavior in an ML-style world. In a Racket/Lisp-style world, I see the behavior of string->number as ideal. I get two behaviors in one function: (1) parsing in the spirit of formal languages (is this 'string' accepted by this 'machine') and (2) translation in the case of success. One advantage of such total functions is of course that they are performant. The signatures/contracts are simple and their functionality is easy to figure out. A disadvantage is that they deepen our dependence on occurrence typing, but so what.

I could also understand providing two versions of the function:

  string->path     : String -> Path u False
  string->path/exn : String -> Path | effect: exn:fail:contract?

Q: Would it be worth our while to comb through the libraries and make the world consistent, even breaking backwards compatibility? I would be willing to run such a project.

-- Matthias
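The two-version idea can be sketched as thin wrappers in either direction. This is only an illustration, not a proposed API: `string->path/f` and `string->number/exn` are hypothetical names chosen so that the built-in functions are not redefined, and the sketch relies on string->path raising exn:fail:contract for strings containing a nul character, as described above.

```racket
#lang racket/base
;; Illustrative sketch only. `string->path/f` and `string->number/exn`
;; are hypothetical names; neither exists in Racket's libraries.

;; #f-producing variant built from the raising string->path.
(define (string->path/f s)
  (with-handlers ([exn:fail:contract? (lambda (e) #f)])
    (string->path s)))

;; Raising variant built from the #f-producing string->number.
(define (string->number/exn s)
  (or (string->number s)
      (raise-argument-error 'string->number/exn
                            "string that parses as a number"
                            s)))
```

A caller who wants the parsing-style behavior can then write `(cond [(string->path/f s) => use-path] [else ...])`, where `use-path` is whatever the caller does with the result, while a caller who treats bad input as a bug uses the raising variant and lets the exception propagate.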
Re: [racket-dev] consistency in names and signatures
FWIW...

* I have no strong opinion on whether it would be worthwhile, if done in a backward-compatible way.

* If done in a *non*-backward-compatible way, it might be a headache. I know of systems in production with millions of lines of PLT/Racket code, and -- although PLT/Racket have been pretty good about backward compatibility -- it seems that with every little non-backward-compatible change in a PLT/Racket version, my big clients feel it significantly. I make a little money every time a platform change inflicts pain, since I have to fix it, but it's a net loss for me when goodwill for the platform is eroded. (And perhaps eroded goodwill for me, who is implicitly endorsing the platform, and who has sometimes been asked directly to explain *why* such-and-such change happened. I would rather be paid to invent and build new stuff than to respond to the platform breaking.)

* I am sympathetic to the idea of being more explicit about types in identifiers. In nontrivial code, I do sometimes end an identifier with -or-false or -or-f, and sometimes I have /error or /exn variants of procedures. It helps me keep track of whether the value can be #f. I usually avoid being this explicit in identifiers in APIs, because it's a little ugly-looking, it has not been idiomatic Racket thus far, and it hasn't seemed necessary.

* If we're going to have exception-raising and #f-producing variants of a procedure, how about accommodating both the little-language and big-language people by having *three* variants: /exn and /f (or /false) for the big-language people who want to be explicit, and no-suffix for the littler-language people who don't need or want all that clutter.

* Would this new world of naming conventions be a good time to replace the somewhat clunky-looking -> naming convention with > or something else? number>string? number-to-string? number-as-string? (No non-ASCII, unless I can get an APL keyboard for my ThinkPad.)

* Maybe we should consider otherwise simplifying some of these identifiers. To use an example, bytes->string/utf-8 is already a mouthful for a pretty common thing, even before we start adding suffixes onto it. (bytes->string/utf-8 might be too easy an example, since UTF-8 encoding would be an appropriate default for a bytes->string nowadays, and consistent with Racket's current behavior when writing a string to a bytes port.)

* Will there be more consistency in how / in an identifier should be read? It seems that X/Y sometimes reads as "X with behavior Y", sometimes as "X with a Y argument", sometimes as "X or Y", and sometimes as something else.

Neil V.

-- http://www.neilvandyke.org/