Re: [Pharo-dev] Empirical Analysis of Programming Language Adoption

Nicolas Cellier Fri, 08 Jun 2018 09:52:36 -0700

2018-06-08 17:06 GMT+02:00 Thomas Dupriez <
thomas.dupr...@ens-paris-saclay.fr>:


> Hello,
> I wanted to just write a quick comment, but it turned into an essay,
> sorry. ^^
>
> Le 08/06/2018 à 16:35, Nicolas Cellier a écrit :
>
>
>
> 2018-06-08 14:50 GMT+02:00 Thierry Goubier <thierry.goub...@gmail.com>:
>
>>
>> Note that this is used in Smalltalk, when you write anInteger, aString
>> : you're using a form of typing for documentation.
>>
>> Exactly!
>
> And if you transpose this style to static typing you get things like
>     Cat *theCat = new Cat;
> Being tainted, I always thought that it was noise...
> You'd better rename your variable felix;)
>
> Naming variables/arguments according to their type is only half a solution
> I think.
> Because there are actually two things I would like to know about a
> variable/argument when reading code: its type and its meaning.
> For example, knowing that an argument named "anInt" is an integer is nice,
> but I would also like to know that it's the number of dice the method has
> to roll. Both pieces of information help to understand the code quickly.
>
> From what I've seen in Pharo, variables are usually named after their
> meaning, while method arguments are named after their type, but in an ideal
> world I would like to know both the type and the meaning of both the
> variables and the arguments.
>
> That's in my opinion one of the advantages of explicit types for reading
> code: you write the type besides the variable/argument, so the name of the
> variable/argument can describe its meaning and I have both pieces of
> information.
>
>
As Dan Luu's overview of studies tends to show (
https://danluu.com/empirical-pl/), one outcome is that, yes, types help to
understand the program, whatever the language.
But types need not be static declarations. They can be in comments, in
variable names, or in other kinds of documentation (that's one reason why
class comments matter, IMO!).

Where this information lies in Smalltalk depends on the kind of variable.

- For method arguments, the semantics are given by the accompanying keyword,
like (name: aString).
  Since methods are short (they should be!), this association is mentally
preserved while reading code.
  The problem comes when the argument can be polymorphic, like aValuable.
  My recommendation is to document expectations in the method comment if
they can't be inferred simply.
  "aValuable: any object understanding value: (a 1-arg block, a Symbol, an
Association, ...)"

- For instance variables, the name carries the semantics: my personal
recommendation is that the expected type SHOULD be documented in the class
comment.
  Once upon a time (in the st80 years) classes were quite well documented.
  If that's not the case, you have to browse the inst. var. writers, and thus
you fall back to the last case, temporary variables, below.
  Whether class comments are missing or incomplete is a criterion of
quality...

- For temporary variables, you generally either instantiate a class, use a
literal or send a message to another variable.
  In the latter case, the problem is to infer the type of the returned
object, which might involve tracking message chains and class comments.
  My experience is that this contributes to the flat learning curve for
beginners.
  At the beginning, we spend a lot of time exploring the message flow, but
once the library is familiar, we don't.
  But this process has a virtue: it gives a deeper understanding of how
things work, and IMO it's beneficial in the long term.
  Personally, it contributed to the pleasure of programming in Smalltalk.
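
To make the first two cases concrete, here is a minimal sketch of what such
documentation could look like. DiceRoller, its instance variables and the
selector rolls:scoredBy: are hypothetical names made up for illustration
(borrowing the dice example), not code from any existing library. The method
is shown in browser notation (Class >> selector), so it is meant to be read
or typed into a browser, not evaluated as-is in a playground.

```smalltalk
"Class comment of the hypothetical class DiceRoller:

 I roll dice and remember the results.

 Instance Variables
	rolls:		<OrderedCollection of Integer> results of past rolls
	generator:	<Random> source of randomness"

"A method of the same hypothetical class. The keywords give the meaning
 of each argument, the method comment gives the expected type."
DiceRoller >> rolls: diceCount scoredBy: aValuable
	"Roll diceCount dice (diceCount is an Integer) and answer the sum of
	 their scores.
	 aValuable: any object understanding #value: (a 1-arg block, a Symbol, ...)"
	^ (1 to: diceCount)
		inject: 0
		into: [ :sum :i | sum + (aValuable value: (generator nextInt: 6)) ]
```

With this in place, the reader gets both the meaning and the type of the
argument without any static declaration, provided the comments are kept up
to date.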

> Static typing may help the IDE (refactoring and navigating).
> My POV is thus that you enter this information for the tools, not for the
> humans (compiler, navigator, refactoring engine, ...).
>
> I have differently tainted colleagues still thinking that types help
> them read code...
> So, IMO, this assertion reflects the dominant culture rather than
> intrinsic merits.
> IOW, if you want to create a successful language, just clone an existing
> one :(
>
> I would agree with your colleagues. I got stuck countless times when
> reading Pharo code, because there was a message send to a variable I didn't
> know the type of, so I couldn't know which method was being called (because
> there were multiple methods with that name in the system). So the only
> solution was to place a breakpoint and get that method to be executed. This
> is not so easy (at least for me) in programs that are not really simple,
> because:
> 1) I need to have this method executed in its "normal use environment". I
> can't just execute it with dummy values as argument, because that will
> affect the values the variables will take.
> 2) I can look into tests, but since I don't know the program, I have no
> idea which tests to run to execute the piece of code.
> So in general, it's either asking someone who knows how the program works,
> or spending a lot of (annoying) time figuring out how the entire program
> flows values to get the type of that one variable.
>
> Thomas
>

My experience is that the difficulty of inferring types comes from
megamorphic messages, or messages with false polymorphism.
If things are difficult for a type inferencer, they will be difficult for us
too.
The advantage we have over an automated type inferencer is that we can infer
the type from the semantics, but that indeed requires prior knowledge of the
library in use...

My conclusion in the case you describe is that you have dealt with complex
libraries (maybe with very abstract objects) that would hence deserve more
documentation.
The problem is worse if you are still on the learning curve for the base
Smalltalk libraries (Collection, Stream, Number, ...).

Let me remind you that one drawback of having static type declarations is
that they severely limit the evolution of the library.
For example, once upon a time we had:
  Collection>>collect: aBlock
But we then hacked:
  Symbol>>value: anObject
      ^anObject perform: self
enabling things like (points collect: #x).
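
For illustration, a small playground sketch of what this hack buys us (the
variable name points and the sample values are arbitrary):

```smalltalk
| points |
points := { 1@2. 3@4. 5@6 }.

"The original style: a one-argument block."
points collect: [ :each | each x ].   "#(1 3 5)"

"The shortcut: the Symbol is sent value: with each element,
 and answers (each perform: #x)."
points collect: #x.                   "#(1 3 5)"
```

Nothing in Collection>>collect: had to change for this to work, which is
the point of the example.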

If you have static declarations, you can't use such a hack until you rewrite
all the type information in the whole Collection library, and that's
impractical.

Note that we did not take the time to rewrite all the type hint information
like this:
  Collection>>collect: aBlockOrSymbol
or with a neologism:
  Collection>>collect: aValuable
The neologism would help only if properly documented, because it's not a
class name, but rather an expectation that the object provides a certain
API.
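
As a rough sketch of what "providing a certain API" means here: any object
that understands value: can stand in for the block. Doubler and the package
name 'Sketch-Valuable' are hypothetical, defined only for this illustration,
and the expressions are meant to be evaluated one at a time in a playground
(the class has to exist before the later expressions refer to it).

```smalltalk
"Define a hypothetical class with no relation to BlockClosure or Symbol."
Object subclass: #Doubler
	instanceVariableNames: ''
	classVariableNames: ''
	package: 'Sketch-Valuable'.

"It only needs to understand #value: to qualify as aValuable."
Doubler compile: 'value: aNumber
	^ aNumber * 2'.

#(1 2 3) collect: Doubler new.            "#(2 4 6)"
#(1 2 3) collect: [ :each | each * 2 ].   "#(2 4 6)"
```

Nothing but documentation tells the reader that a Doubler is acceptable
there, which is why the neologism only helps if it is explained somewhere.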

The information that we can use a Symbol because it is (partly) polymorphic
with BlockClosure is completely implicit.
This has to be learned from tutorials or from reading code examples.
One important quality for the library in such cases is to have the most
uniform rules possible for those conventions.
