Re: [racket-dev] consistency in names and signatures

2012-04-04 Thread ozzloy-racket-dev
along the consistency in function naming vein:
file-name-from-path versus filename-extension.  is filename 1 word or 2?
 i prefer 1.

even more tangential, why isn't file-name-from-path path-filename
instead?  or even basename?

On Thu, Mar 29, 2012 at 08:07, David T. Pierson d...@mindstory.com wrote:

 On Thu, Mar 29, 2012 at 12:44:35AM -0400, David T. Pierson wrote:
  (Presumably if equally concise names that better reflected function
  signatures were available, they would have been used in the first
  place.)

 Sorry for the double post.  I should have added "equally lucid" along
 with "equally concise".

 Perhaps what I should have asked was simply whether there exist names
 that better indicate function signatures but are still good in all or
 most other important aspects, and whether it is worth breaking
 compatibility for such.

 David
 _
  Racket Developers list:
  http://lists.racket-lang.org/dev



Re: [racket-dev] consistency in names and signatures

2012-03-28 Thread Andy Gocke
Putting aside the 8 (yeah, really) ways to report errors in Haskell, this
is the option provided by the Maybe type (data Maybe a = Just a | Nothing).
While I see many benefits to this approach, I think contracts may provide a
new way out. In most typed languages the major constraint seems to be that
types are checked at compile time and limited in complexity. Contracts,
however, have practically unlimited power. The trick is that one would have
to incorporate some way to reify the calculation checked by the contract.
If you could get a hold of the value produced by contract checking then no
duplicate computation would have to occur inside the function. This is
especially relevant for functions like string->number because the most
obvious implementation checks validity during parsing -- checking the
validity and parsing basically duplicate the function.

Also, if one sees a sort of inner function post-contract-check and an
outer function including the contract check then one could even term the
inner function as total and the outer function as partial.

The advantages of this approach, as far as I can see, are that it removes
large amounts of failure checking and it encourages large amounts of
precondition code to move into the contract. Not only is this good for
documentation and interfaces, but it helps with my current project (random
testing through contracts).
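
A rough sketch of the inner/outer split described above, assuming the check can hand back the value it computed. It does not use racket/contract; `check-number`, `double-checked`, and `double-string` are hypothetical names made up for illustration.

```racket
#lang racket/base

;; Hypothetical checker: validates and parses in one pass.
;; Returns the parsed number on success and #f on failure.
(define (check-number s)
  (and (string? s) (string->number s)))

;; Inner function: total on values that already passed the check.
(define (double-checked n)
  (* 2 n))

;; Outer function: partial. It runs the check once and passes the
;; value the check produced to the inner function, so nothing is
;; parsed twice.
(define (double-string s)
  (define n (check-number s))
  (unless n
    (raise-argument-error 'double-string "string that parses as a number" s))
  (double-checked n))
```

Here `(double-string "21")` parses the string only once, and a string that does not parse raises an exn:fail:contract exception from the outer function.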

2c

Cheers,
Andy

On Tue, Mar 27, 2012 at 5:23 PM, Matthias Felleisen matth...@ccs.neu.edu wrote:


 Bug report 12652 reminded me of a topic that I brought up a while back,
 that I tried to incorporate into the Style Guide, and that I forgot to
 re-introduce here.

 Background: a lot of people think that consistency in naming,
 signature/contract, and functionality (for methods and functions) is a key
 element to successful software projects. If you saw Yaron Minsky's talk at
 POPL or if you are in a department where he delivered his "OCaml is great
 for trading" talk, you know what I mean. He formulates this point well, and
 he gives good examples.

 Topic: In our world, we have functions such as

  string->number
  string->path
  string->url
  bytes->string/utf-8

 The naming consistency is good, but they aren't really consistent at the
 signature or functionality level:

  string->number produces #f when called on "hello world" or "\0"
  string->path fails on "\0"
  string->url succeeds on "\0" and produces a url

 I consider this less than desirable. I understand arguments for #f and
 exceptional behavior in an ML-style world. In a Racket/Lisp style world, I
 see the behavior of string->number as ideal. I get two behaviors in one
 function:

  (1) parsing in the spirit of formal languages (is this 'string' accepted
 by this 'machine')
  (2) translation in the case of success.

 One advantage of such total functions is of course that they are
 performant. The signatures/contracts are simple and their functionality is
 easy to figure out. A disadvantage is that they deepen our dependence on
 occurrence typing, but so what.

 I could also understand providing two versions of the function:

  string->path : String -> Path u False
  string->path/exn: String -> Path | effect: exn:fail:contract?

 Q: Would it be worth our while to comb through the libraries and make the
 world consistent, even breaking backwards compatibility? I would be willing
 to run such a project.

 -- Matthias






Re: [racket-dev] consistency in names and signatures

2012-03-28 Thread Eli Barzilay
Yesterday, Andy Gocke wrote:
 [...]  This is especially relevant for functions like string->number
 because the most obvious implementation checks validity during
 parsing -- checking the validity and parsing basically duplicate the
 function.

And that makes most of my point.

The thing is that `string->url' is basically *just* a parser -- it
does very little after matching the regexp.  I therefore view the
commit as adding a contract to, say, `read-xml', where the contract
runs the function to see that the input is valid.

An even more extreme example would be `get-pure-port': if you really
want a complete specification of the domain in a contract, then the
contract should make sure that the server is reachable, and that it
returns a valid page.  Combine this with parsing the page, and how
this is not really a great way to run code (ATM!) should be clear.
Besides the issue of doing a bunch of work twice, the contract would
still be broken since having a valid server and/or a page now doesn't
mean that it's going to be valid on the next attempt.  To make this
practical, you'd need some way to expose values that are computed
as part of the contract.  ("Reify" feels wrong to me in this
context...)

That's why I added the above "ATM".  There is an obvious appeal in
doing this -- having all error handling in specific pieces of code and
floating them upwards sounds tempting *if* there's some way to do it
right.  I suspect that such an exposure of the contract results is
just one small step in getting this.  I'm also not sure that it's
doable in a way that actually leads to a practical benefit.  This is
similar to me doubting the theoretical utility in running a parser
twice: on one level you get your guaranteed, nicely total function,
but on the level of providing that guarantee, you get the original
problem.  (And in terms that I'm used to, this is switching the same
work to your well-formedness goal, and that buys nothing in terms of
getting things done.)

IMO, this problem is fundamental enough that it shows up in many
contexts.  One of them is already visible in the `string->url'
example.  The new documentation reads:

  | url-regexp : regexp?
  |
  |   This is a regular expression based on the one in Appendix B of
  |   RFC 3986 for recognizing urls.  This is the precise regexp:
  |
  |   ^(?:([^:/?#]*):)?(?://(?:([^/?#@]*)@)?([^/?#:]*)?(?::([0-9]*))?)
  |   ?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$

(Pre-disclaimer: the following is not said in a negative way.)

At least in my view, this documentation is useless.  It's true that
it's precise, but as a user of this code, I get nothing out of it.  I
can't even *use* that regexp (the one quoted in the docs) since it
looks like something that can easily change, so I better use the
`url-regexp' binding and not the quoted regexp.

But the deeper reason that this is not useful to me is that it
essentially spells out the parser code -- and documenting a function
using its own code is (IMO) often a sign that the abstraction is
questionable.


But there are a few additional problems with this change that I see:

* Beyond quoting it in the documentation, exposing the regexp means
  that it becomes part of the interface.  This means that I now cannot
  re-implement the code in any way other than matching a regexp.

* It is still partial.  For example, this

-> (string->url "1:/")
; Invalid URL string; bad scheme "1": "1:/"

  is still not a contract error.  (And I can't see an obvious way to
  add it to the regexp, maybe with some lookahead tricks.)

  Another example is the host part, which is not even checked, but
  this is just sloppiness (= deferring it to network errors that will
  happen with malformed hosts).  And BTW, doing that means that the
  contract becomes platform dependent:

-> (file-url-path-convention-type 'unix)
-> (url-host (string->url "file://x:x/baz"))
"x"
-> (file-url-path-convention-type 'windows)
-> (url-host (string->url "file://x:x/baz"))


* More importantly, and possibly related to the first bullet, it
  stands in the way of improving this code.  There is a major problem
  in the design of the code -- it parses all urls as `http'.  A proper
  way to deal with it is to choose a specific parser based on the
  scheme.  For example, as it looks now, I can't change it to properly
  treat "mailto:..." urls.

  That's not theoretical -- I planned on doing that extension, and now
  it is impossible to do it in a nice way.
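
A minimal sketch of what choosing a parser by scheme could look like. This is not the net/url code: `parse-url/by-scheme`, `parse-mailto`, and `parse-generic` are hypothetical names, and the per-scheme parsers are stubs that only tag their input.

```racket
#lang racket/base

;; Hypothetical per-scheme parsers (stubs). Each receives the text
;; that follows "scheme:".
(define (parse-mailto rest)
  (list 'mailto rest))

(define (parse-generic scheme rest)
  (list 'generic scheme rest))

;; Split off the scheme first, then dispatch on it instead of
;; parsing every url as if it were an http url.
(define (parse-url/by-scheme s)
  (define m (regexp-match #rx"^([a-zA-Z][a-zA-Z0-9+.-]*):(.*)$" s))
  (cond [(not m) (parse-generic #f s)]
        [(string-ci=? (cadr m) "mailto") (parse-mailto (caddr m))]
        [else (parse-generic (cadr m) (caddr m))]))
```

With this shape, supporting another scheme means adding one clause and one parser, and no single regexp has to describe every kind of url.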

-- 
  ((lambda (x) (x x)) (lambda (x) (x x)))  Eli Barzilay:
http://barzilay.org/   Maze is Life!


[racket-dev] consistency in names and signatures

2012-03-27 Thread Matthias Felleisen

Bug report 12652 reminded me of a topic that I brought up a while back, that I 
tried to incorporate into the Style Guide, and that I forgot to re-introduce 
here. 

Background: a lot of people think that consistency in naming, 
signature/contract, and functionality (for methods and functions) is a key 
element to successful software projects. If you saw Yaron Minsky's talk at POPL 
or if you are in a department where he delivered his "OCaml is great for trading"
talk, you know what I mean. He formulates this point well, and he gives good 
examples. 

Topic: In our world, we have functions such as 

 string->number
 string->path
 string->url
 bytes->string/utf-8

The naming consistency is good, but they aren't really consistent at the 
signature or functionality level: 

 string->number produces #f when called on "hello world" or "\0"
 string->path fails on "\0"
 string->url succeeds on "\0" and produces a url

I consider this less than desirable. I understand arguments for #f and 
exceptional behavior in an ML-style world. In a Racket/Lisp style world, I see 
the behavior of string->number as ideal. I get two behaviors in one function:

 (1) parsing in the spirit of formal languages (is this 'string' accepted by 
this 'machine') 
 (2) translation in the case of success. 

One advantage of such total functions is of course that they are performant. 
The signatures/contracts are simple and their functionality is easy to figure 
out. A disadvantage is that they deepen our dependence on occurrence typing, 
but so what. 

I could also understand providing two versions of the function: 
 
  string->path : String -> Path u False
  string->path/exn: String -> Path | effect: exn:fail:contract?
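
A minimal sketch of the two variants listed above, not a proposal for the actual library code. To avoid shadowing the existing `string->path`, the #f-producing version is given the hypothetical name `string->path/f`; `describe` is also made up, and the validity test is simply `path-string?`.

```racket
#lang racket/base

;; Hypothetical #f-producing variant: a path, or #f when the string
;; cannot name a path (empty, or containing a nul character).
(define (string->path/f s)
  (and (string? s)
       (path-string? s)
       (string->path s)))

;; Hypothetical exception-raising variant built on the first one.
;; raise-argument-error raises an exn:fail:contract exception.
(define (string->path/exn s)
  (or (string->path/f s)
      (raise-argument-error 'string->path/exn "path-string?" s)))

;; With the #f variant, a caller tests and translates in one step.
(define (describe s)
  (cond [(string->path/f s) => path->string]
        [else "not a path"]))
```

The `cond` clause with `=>` shows the "two behaviors in one function" usage: the same call both answers whether the string is accepted and supplies the translated value.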

Q: Would it be worth our while to comb through the libraries and make the world 
consistent, even breaking backwards compatibility? I would be willing to run 
such a project. 

-- Matthias






Re: [racket-dev] consistency in names and signatures

2012-03-27 Thread Neil Van Dyke

FWIW...

* I have no strong opinion on whether it would be worthwhile, if done in 
a backward-compatible way.


* If done in a *non*-backward-compatible way, it might be a headache.  I 
know of systems in production with millions of lines of PLT/Racket code, 
and -- although PLT/Racket have been pretty good about backward 
compatibility -- it seems that my big clients feel every little 
non-backward-compatible change to a PLT/Racket version significantly.  I 
make a little money every time a platform change inflicts pain, since I 
have to fix it, but it's a net loss for me when goodwill for the 
platform is eroded.  (And perhaps eroded goodwill for me, who is 
implicitly endorsing the platform, and who has sometimes been asked 
directly to explain *why* such-and-such change happened.  I would 
rather be paid to invent and build new stuff, not be responding to the 
platform breaking.)


* I am sympathetic to the idea of being more explicit about types in 
identifiers.  In nontrivial code, I do sometimes end an identifier as 
"-or-false" or "-or-f", and sometimes I have "/error" or "/exn" variants 
of procedures.  It helps me keep track of whether the value can be #f.  
I usually avoid being this explicit in identifiers in APIs, because it's 
a little ugly-looking, it has not been idiomatic Racket thus far, and it 
hasn't seemed necessary.


* If we're going to have exception-raising and #f-producing variants of 
a procedure, how about accommodating both the little language and big 
language people by having *three* variants: "/exn" and "/f" (or 
"/false") for the big language people who want to be explicit, and 
no-suffix for the littler language people who don't need or want all 
that clutter.


* Would this new world of naming conventions be a good time to replace 
the somewhat clunky-looking "->" naming convention with ">" or something 
else?  "number>string"? "number-to-string"?  "number-as-string"?  (No 
non-ASCII, unless I can get an APL keyboard for my ThinkPad.)


* Maybe we should consider otherwise simplifying some of these 
identifiers.  To use an example, "bytes->string/utf-8" is already a 
mouthful for a pretty common thing, even before we start adding suffixes 
onto it.  ("bytes->string/utf-8" might be too easy an example, since 
UTF-8 encoding would be an appropriate default for a "bytes->string" 
nowadays, and consistent with Racket's current behavior when writing a 
string to a bytes port.)


* Will there be more consistency in how "/" in an identifier should be 
read?  It seems that "X/Y" sometimes reads as "X with behavior Y", 
sometimes as "X with a Y argument", sometimes as "X or Y", and sometimes 
as something else.


Neil V.

--
http://www.neilvandyke.org/
