Re: studies of naming?

Richard O'Keefe Tue, 27 Mar 2012 20:12:48 -0700

On 27/03/2012, at 10:14 PM, Steven Clarke wrote:

> Yes, you're right Richard. Our study was focused on languages like Java and 
> C# (and importantly, as they existed around 2006/2007). So as you say, we 
> wouldn't generalize the results to languages that have named parameters. We 
> described the two design choices we evaluated at the start of the paper:
> 
> "There are two common design choices: provide only
> constructors that require certain objects (a "required
> constructor"). This option has the benefit of enforcing
> certain invariants at the expense of flexibility. An
> alternative design, "create-set-call," allows objects to
> be created and then initialized."


There are actually three design choices, and I've seen the third
one too often for comfort.  Please note that this is a *different*
choice from the one you endorsed:

Good Smalltalk design:

   Provide a variety of factory methods with keyword arguments,
   all of which provide fully initialised objects satisfying
   the class invariant; such objects often need little or no
   mutation afterwards.

   Good Eiffel design agrees in every respect except 'keyword arguments'.

Good create-set-call design:

   Ensure that the default 'new' constructor returns a fully
   initialised object satisfying the class invariant and
   offering meaningful default behaviour; such objects almost always
   require adjustment to get them into the state you really want but
   all states are meaningful.

   IN ADDITION make sure that all 'initialisation phase' methods
   can safely be called AT ALL TIMES.

   Ensure that every public method is *tested* with a default-
   initialised object.

Bad create-set-call design:

   Don't think about class invariants.
   Rely on default constructors that leave fields with default
   values (0, nil, &c) that satisfy types but not invariants.
   Allow objects to be named outside their class in partially
   initialised states that require 'initialisation phase' methods
   to be called before 'work phase' methods, but do not check
   that this has been done.

   Allow 'initialisation phase' methods to be called at any time
   without checking if it makes sense, allowing even well-initialised
   objects to subsequently be put into inconsistent states.
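
To make the contrast between the two create-set-call variants concrete,
here is a minimal Python sketch.  The class names (BadTimer, GoodTimer)
and the invariant "interval > 0" are invented for illustration; they come
neither from the paper nor from Pharo.

```python
class BadTimer:
    """Bad create-set-call: the default state satisfies types, not invariants."""

    def __init__(self):
        self.interval = None          # invariant (interval > 0) does not hold yet

    def set_interval(self, seconds):
        self.interval = seconds       # unchecked: can break a good object later

    def ticks_in(self, duration):
        # 'Work phase' method; nothing checks that set_interval was called.
        return duration // self.interval


class GoodTimer:
    """Good create-set-call: every reachable state satisfies interval > 0."""

    def __init__(self):
        self.interval = 1             # meaningful default behaviour

    def set_interval(self, seconds):
        # Safe to call at any time: it refuses to break the invariant.
        if seconds <= 0:
            raise ValueError("interval must be positive")
        self.interval = seconds

    def ticks_in(self, duration):
        return duration // self.interval
```

A freshly created GoodTimer can be used immediately, while calling
ticks_in on a freshly created BadTimer fails with a TypeError far from
the place where the mistake was made.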

Much as I love Smalltalk, a lot of the code that I see using the
create-set-call pattern is actually doing the >bad< version.
Here's an example that took only 2 minutes to find.
In Pharo 1.1 (which is not the current version)
        Url new<cmd-P>
which is the equivalent of System.out.println(new Url())
raises an exception.  Url new created an uninitialised object.
That's _almost_ fair enough:  this is supposed to be an abstract
class, but it should have been caught in #new.  Go to a concrete
subclass:
        FileUrl new<cmd-P>
also raises an exception, trying to print the elements of a nil
String.  

> However, I don't think our result is unsurprising,

All I can say is that it didn't surprise _me_.
Compare for example
        f = fopen(x, y); /*C*/
        s := FileStream read: x. "ST"

        s = new FileStream(y, x); //C#

Smalltalk: obvious what it does.
C: which argument is which?  Compiler can't help.
C#: three different things it could be; the compiler
can tell them apart, but it's not so easy for people.
And when you get to

        new FileStream(String, FileMode, FileSystemRights,
                FileShare, Int32, FileOptions)

this is obviously going to be a lot harder for people to
read than
        FileStream read: string rights: rights share: share
                bufferSize: int32 options: options

If you *could* do
        s = new FileStream()
                .FileName(string)
                .Rights(rights)
                .Share(share)
                .BufferSize(int32)
                .Options(options)
                .Open();
that would be a lot clearer.

And that introduces a fourth design pattern, which the paper did
not investigate, call it "initializer object".

The general scheme for Initializer Object is
        class X has a static Maker() method
        returning an instance of X_Maker().

        X_Maker() has methods like
                Facet(value)
        returning the same X_Maker() object
        and a completion method called something
        like    Open()
        or      Create()
        that returns a fully initialised instance of X.

This requires a creation style like

        s = FileStream.Maker()
                .FileName(string) &c as before
                .Open();
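
The scheme can be sketched in Python as follows.  This is only an
illustration under stated assumptions: FileStream here is a hypothetical
stand-in, not the real .NET class, the facet names and default values are
invented, and nothing touches the file system.

```python
class FileStream:
    """Hypothetical stand-in for illustration; not the real .NET FileStream."""

    def __init__(self, file_name, rights, share, buffer_size):
        # Intended to be called only by FileStreamMaker.open(), which has
        # already validated the facets, so the object starts out complete.
        self.file_name = file_name
        self.rights = rights
        self.share = share
        self.buffer_size = buffer_size

    @staticmethod
    def maker():
        return FileStreamMaker()


class FileStreamMaker:
    """Collects facets; builds nothing until open() is called."""

    def __init__(self):
        self._file_name = None      # required, no sensible default
        self._rights = "read"       # assumed defaults
        self._share = "none"
        self._buffer_size = 4096

    def file_name(self, value):
        self._file_name = value
        return self                 # same maker, so calls can be chained

    def rights(self, value):
        self._rights = value
        return self

    def share(self, value):
        self._share = value
        return self

    def buffer_size(self, value):
        self._buffer_size = value
        return self

    def open(self):
        # Completion method: check the whole request, then deliver the object.
        if self._file_name is None:
            raise ValueError("file_name must be set before open()")
        if self._buffer_size <= 0:
            raise ValueError("buffer_size must be positive")
        return FileStream(self._file_name, self._rights,
                          self._share, self._buffer_size)


s = (FileStream.maker()
     .file_name("log.txt")
     .buffer_size(8192)
     .open())
```

The caller never holds a partially initialised FileStream: mistakes such
as a missing file name are reported by open(), before any FileStream
exists.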


> You've highlighted the core of the debate when, referring to the people who 
> prefer the create-set-call pattern, you said " the idea of them writing any 
> code that might affect my life or the life of anyone known to me is not one 
> that's going to help me sleep at night ". This was also the initial reaction 
> of many people inside Microsoft when they heard the results of our study. 
> 
> Our response in this debate has always been that different programmers 
> require different APIs. That's the message we tried to communicate in the 
> paper when we described the different personas. These personas represent 
> different workstyles of the developers we have observed using the .Net 
> framework. They are a crucial tool in our ability to successfully design a 
> framework that is broadly usable by millions of developers.

The paper didn't just say that these programmers didn't LIKE or weren't
COMFORTABLE using full-initialisation constructors, but that they didn't
really get the idea.  Is it really a good idea to design a framework that
is (ab)usable by people who probably shouldn't be programming in the first
place?

Less dismissively:  would these people have got the idea of Initializer Object?

> One of the hardest things to do in order to use these workstyles successfully 
> to design an API is for the API designer to let go of their own biases 
> towards how things should be done, and instead, design the API based on a 
> deep understanding of how the user expects things to be done.

You can call it a bias if you want.  I call it sheer terror based on seeing it
done wrong (as in: programs crash) far too often.

The paper did not suggest to me that a deep understanding of what those users
thought was happening had been reached, or even sought.  In particular, may I
offer the McDonald's analogy?  You go to a McDonald's, and you tell them what
you want.  As you say
        - I want a hamburger
        - I want the mighty angus
        - no onion or pickles
do you see this?
        * a hamburger bun materialises in front of you
        * it's filled with the beef and trimmings
        * the onion and pickles are taken away
No.  They *take your order*, and then deliver a complete hamburger not entirely
unlike the way you want it, and then you eat it.  You can't eat an incompletely
assembled hamburger, because they don't give you one.

In the Initializer Object pattern, the initializer object is like the order the
person behind the counter is filling out.  When you say "that's it" and pay,
_that's_ when they select or assemble the hamburger and deliver it to you.  Up
to that point, you can revise the order if you want to.

Isn't it at least possible that the people who use new _() and then
fill out their order might prefer Initializer Object?  Surely there's no
shortage of Windows developers who have bought a hamburger...

A fifth design pattern could be called "Lazy Biphasic Object".

A Biphasic object is one with (at least) two distinct states:
initialisation phase, where various facets can be set up, and
operational phase(s), where the object does whatever you really
wanted it to do.  Methods may be classified as
 - initialisation only
 - operation only
 - multiphase

Lazy Biphasic Object is where the object changes from initialisation
phase to operational phase the first time an operation only method is
called.  The best known instance of this is C FILE objects, where
setbuf() and setvbuf() are initialisation only methods, and getc()
and putc() are operation only methods, and the first time you call
getc() or putc() the initialisation process is only then completed.

Are the programmers who are only comfortable with create-set-call
really thinking in terms of lazy biphasic objects?  Would the API
be better if designed that way?  Would they mind at all if the
transition were explicit rather than implicit?
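
A minimal Python sketch of a lazy biphasic object, loosely modelled on the
FILE example.  The Channel class and all its method names are hypothetical.
One assumption differs from C: the C standard only permits setvbuf() before
any other operation on the stream and does not promise a checked error
afterwards, whereas this sketch raises an exception for an out-of-phase
call.

```python
class Channel:
    """Hypothetical lazy biphasic object, loosely modelled on C's FILE."""

    def __init__(self):
        self._buffer_size = 512       # meaningful default
        self._operational = False
        self._buffer = None
        self._used = 0

    # --- initialisation only ---
    def set_buffer_size(self, size):
        if self._operational:
            raise RuntimeError("set_buffer_size called in operational phase")
        if size <= 0:
            raise ValueError("buffer size must be positive")
        self._buffer_size = size

    # --- the implicit phase change ---
    def _ensure_operational(self):
        if not self._operational:
            # Initialisation is completed only now, on first real use.
            self._buffer = bytearray(self._buffer_size)
            self._operational = True

    # --- operation only ---
    def put(self, byte):
        self._ensure_operational()
        if self._used == len(self._buffer):
            self._used = 0            # pretend to flush the full buffer
        self._buffer[self._used] = byte
        self._used += 1

    # --- multiphase ---
    def is_operational(self):
        return self._operational
```

Making the transition explicit would amount to exposing
_ensure_operational as a public method and having put() raise if it has
not been called yet.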

> That doesn't mean that we always do everything the way that users expect them 
> to be done but it does mean that when we decide to do something that differs 
> from their expectations we do it consciously and deliberately.

It also means that you need to know what it REALLY is that they are expecting.
Maybe you found that out, but the paper didn't _say_.  There is a big
difference between create-set-call where the object is *always* ready for use
and lazy biphasic object where it is an error to call an initialisation-only
method in operational phase.

> In this case, the understanding we had gained from the study reported in this 
> paper and from many other studies we had run internally (at one point we were 
> running API user experience studies on the .Net framework monthly) indicated 
> that we needed to design the API to accommodate the users preference for 
> initializing objects since the alternative was that many developers would 
> have a very difficult time using APIs that were designed differently to their 
> expectations.

You offer users an *extremely* limited choice, and then talk about what
they did as their *preference*?  The paper didn't mention the Initializer Object
approach:  how do you know they would not have preferred that?
The paper didn't mention explicit biphasic object:  how do you know they would
not have preferred that?  It didn't mention lazy biphasic object (with
exceptions raised for out-of-phase invocations).  How do you know they
would not have
preferred that?  The paper did not mention single-point-of-construction with
keyword arguments (or even passing a dictionary, as you might do in Python).
How do you know people would not have preferred that?
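
For concreteness, single-point-of-construction with keyword arguments
might look like this in Python.  The Connection class, its parameters and
its defaults are all invented for illustration.

```python
class Connection:
    """Hypothetical class; every name here is invented for illustration."""

    def __init__(self, *, host, port=80, timeout=30.0):
        # The bare '*' makes every argument keyword-only, so call sites
        # must name each facet and cannot mix up the order.
        if not host:
            raise ValueError("host must be non-empty")
        if not 0 < port < 65536:
            raise ValueError("port out of range")
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        self.host = host
        self.port = port
        self.timeout = timeout


# Keyword arguments at the call site.
a = Connection(host="example.org", port=8080)

# The same constructor fed from a dictionary.
options = {"host": "example.org", "timeout": 5.0}
b = Connection(**options)
```

There is one constructor, every object it returns satisfies the
invariant, and the reader of the call site can see which value is which.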

Oh, it's great that you did the experiment, and really great that you wrote it
up, but the range of alternatives considered was _far_ too small to base any
far-reaching decisions on.

Again, I repeat: whatever you actually found out, the *paper* does not tell me
what those users actually thought was happening or wanted to happen, only which
notation fitted those thoughts less badly.
> 
> You're right again in saying that we knew these workstyles existed before the 
> study. Our paper describes the different ways that developers exhibiting 
> these workstyles prefer to initialize objects. This was useful information 
> for us at Microsoft in determining at that time, how best to accommodate the 
> preferences of many of our customers.
> 
> It's interesting to note that the opportunistic workstyle has since been 
> observed and studied by others, most notably by Scott Klemmer's group at 
> Stanford: http://hci.stanford.edu/research/opportunistic/

Let me quote that page:

        Opportunistic Programming is a method of software development
        that emphasizes speed and ease of development over code robustness
        and maintainability.

Sheil had a "Power tools for programmers" paper in the early 1980s arguing for
"exploratory programming" and how to support it.  (Of course Stanford was in
touch with work on Lisp and Smalltalk at Xerox for a long time, so this old
topic should have been very well known at Stanford.)

As a programmer, I want to go at top speed.

As a user of other people's programs, I am heartily *sick* of
programs where insufficient attention was paid to robustness.
Just yesterday I was trying to help a student debug a program
in a language not entirely unlike C# where if he started n copies
of his program several seconds apart, all went well, but start
them in a shell loop and a random copy would get a completely
black window.  I honestly could not find anything in _his_ code
to justify this.  This was _not_ a good learning experience for him.



-- 
The Open University is incorporated by Royal Charter (RC 000391), an exempt 
charity in England & Wales and a charity registered in Scotland (SC 038302).
