Naming things
There are only two hard things
in Computer Science: cache
invalidation and naming things.
-- Phil Karlton
Hello,
There has been a lot of debate lately regarding the names of some
functions newly added to Phobos.
Mostly this concerns the good work of Walter Bright, and new
functions which operate on ranges rather than strings.
On several occasions, many people felt that the names could have
been chosen better - in some cases, much better. This is not an
isolated occurrence, but a recurring one:
http://forum.dlang.org/post/ybbwpgmgsqvmbvoqh...@forum.dlang.org
https://github.com/D-Programming-Language/phobos/pull/2149#issuecomment-42867964
So far, post-merge name changes have been rejected:
https://github.com/D-Programming-Language/phobos/pull/3243
https://github.com/D-Programming-Language/phobos/pull/3426
Two examples of controversial name pairs: setExt/setExtension,
and toLower/toLowerCase. These functions have the same
functionality, but one of them is eager, and the other is lazy.
Can you guess which is which?
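For anyone unfamiliar with the eager/lazy distinction, here is a
rough sketch in JavaScript (the language I use for a comparison
further down). The function names lowerEager and lowerLazy are
made up for illustration; they are not the Phobos functions, and
the sketch does not say which Phobos name has which behavior.

```javascript
// Hypothetical names, for illustration only (not Phobos APIs).

// Eager: does all the work immediately and returns a finished string.
function lowerEager(text) {
  return text.toLowerCase();
}

// Lazy: returns a generator; each character is converted only when
// the caller asks for the next one.
function* lowerLazy(text) {
  for (const ch of text) {
    yield ch.toLowerCase();
  }
}

const eager = lowerEager("ABC");   // the whole result exists now
const lazy = lowerLazy("ABC");     // nothing has been converted yet
const firstTwo = [lazy.next().value, lazy.next().value];

console.log(eager);
console.log(firstTwo);
```

The two versions compute the same thing, but the caller has to
know which one they are holding, and nothing in a pair of names
like these tells them unless the names are chosen to say so.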
I would like to argue that these rejections were poorly argued,
and that the renames should have been accepted. Although the
names of these particular functions are the primary point of this
post, I would also like to discuss the general policy on minor
changes.
I have discussed the issue at hand with Andrei Alexandrescu on
IRC. Here are some points gathered:
1.
The renames do not apply to code that appeared in a DMD release,
thus they are non-breaking changes.
As the actual code has been merged, the renames are not blocking
anything either.
2.
There seems to be some confusion regarding what counts as
consensus.
Walter Bright argues that there is no consensus regarding the new
names. I would like to split this argument into two questions:
a) Is there consensus that the current names are very bad, and
should be changed?
b) Is there consensus on which name to use?
These two questions must not be confused.
I think there is sufficient evidence that everyone who has an
opinion on the names agrees that the current names are pretty
bad.
What the new names should be is of secondary importance, as long
as the names are changed to any of the suggested names.
3.
The main argument against allowing post-merge renames is that
allowing one invites an infinite number of other minor changes. I
think this is not a good argument, because:
- In this particular case, there is unanimous agreement that the
current names are objectively bad, and should be changed. There
are no arguments that show that e.g. "setExt" is a better name
than e.g. "withExtension". I see no problem with not acting on
renaming suggestions in situations when there is no consensus.
- Naming things well matters. We need to treat renames in the
same way as minor breaking changes. In the same way that we do
not reject minor breaking fixes and improvements to the
implementations of functions that have not yet been released, we
should improve the naming of identifiers if there is consensus
that the change is an improvement.
4.
I have often heard the argument that bikeshedding distracts from
getting actual work done, or variations of such. I think this
argument is flawed.
Discussions about minor things will continue regardless of
whether small changes, such as renames, are rejected as a matter
of policy or not. (Yes, this post is an example of this.) Yes,
allowing some minor changes but not others will generate debate
on why some changes were accepted and others not. Rejecting all
minor changes does not prevent such debate from occurring,
especially since there will always be exceptions (see e.g.
std.meta).
I would thus like to argue that the policy of "no minor changes,
even non-breaking" should be reviewed.
5.
Again, naming things well matters. An API with confusing or
overlapping identifier names is a bad API. I've said this above
but I want to say this again: we need to stop looking at renames
as evil or as a waste of time, and look at them in the same way
as (breaking) changes to the API / functionality. Just as API
or functionality changes can be subjective in their usefulness,
so can renames be controversial or overwhelmingly positive.
I do not disagree that how well identifiers are named is a
secondary concern to the functionality that they provide. But
this does not mean that we should ignore the quality of the
names, and furthermore, reject any attempts to improve them.
6.
Concerning the naming itself.
My involvement began when my PR to rename setExt to
withExtension was closed.
I would like to present a very similar case in another language,
JavaScript.
The String type has two methods with similar names and
functionality: "substr" and "substring". If you search the web,
you will find a great deal of confusion over these two
functions:
http://stackoverflow.com/questions/3745515/what-is-the-difference-between-substr-and-substring
https://nathanhoad.net/javascript-difference-between-substr-and-substring
http://javarevisited.blogspot.com/2013/08/difference-between-substr-vs-substring-in-JavaScript-tutorial-example.html
https://rapd.wordpress.com/2007/07/12/javascript-substr-vs-substring/
https://www.youtube.com/watch?v=OAameXW5r10
I think it's safe to say that it's one of JavaScript's many small
warts.
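To make the confusion concrete, here is a small sketch of how the
two differ. The sample string and variable names are mine; the
behavior shown is the standard one for String.prototype.substr
and String.prototype.substring.

```javascript
const s = "Hello, world";

// substr(start, length): the second argument is a length.
const bySubstr = s.substr(7, 3);

// substring(start, end): the second argument is an end index, and
// the two arguments are swapped if start is greater than end.
const bySubstring = s.substring(7, 3);

// Negative start: substr counts from the end of the string,
// while substring treats a negative index as 0.
const tailSubstr = s.substr(-5);
const tailSubstring = s.substring(-5);

console.log(bySubstr);      // "wor"
console.log(bySubstring);   // "lo, "
console.log(tailSubstr);    // "world"
console.log(tailSubstring); // "Hello, world"
```

Same arguments, nearly identical names, different results, and
nothing in the names hints at which is which.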
The closest analogy with the case at hand is the
toLower/toLowerCase functions. It would be unfortunate if we were
to have such warts in D, so I think we should at least not
outright reject PRs which fix them.