Naming things

      There are only two hard things
      in Computer Science: cache
      invalidation and naming things.
      -- Phil Karlton

Hello,

There has been a lot of debate lately regarding the names of some functions recently added to Phobos.

Mostly this concerns the good work of Walter Bright, and new functions which operate on ranges rather than strings.

On several occasions, many people felt that the names could have been chosen better - in some cases, much better. This is not a singular occurrence, but a recurrent one:

http://forum.dlang.org/post/ybbwpgmgsqvmbvoqh...@forum.dlang.org
https://github.com/D-Programming-Language/phobos/pull/2149#issuecomment-42867964

So far, post-merge name changes have been rejected:

https://github.com/D-Programming-Language/phobos/pull/3243
https://github.com/D-Programming-Language/phobos/pull/3426

Two examples of controversial name pairs: setExt/setExtension, and toLower/toLowerCase. Within each pair, the two functions have the same functionality, but one of them is eager, and the other is lazy. Can you guess which is which?

I would like to argue that these rejections were poorly argued and that the renames should have been accepted. Although the names of these particular functions are the primary point of the post, I would like to also discuss the general policy on minor changes.

I have discussed the issue at hand with Andrei Alexandrescu on IRC. Here are some points gathered:

1.

The renames do not apply to code that appeared in a DMD release, thus they are non-breaking changes.

As the actual code has been merged, the renames are not blocking anything either.

2.

There seems to be some confusion regarding what counts as consensus.

Walter Bright argues that there is no consensus regarding the new names. I would like to split this argument into two questions:

a) Is there consensus that the current names are very bad, and should be changed?

b) Is there consensus on which name to use?

These two questions must not be confused.

I think there is sufficient evidence that everyone who has an opinion on the names agrees that the current names are pretty bad.

What the new names should be is of secondary importance, as long as the names are changed to any of the suggested names.

3.

The main argument against allowing post-merge renames is that allowing one invites an infinite number of other minor changes. I think this is not a good argument, because:

- In this particular case, there is unanimous consensus that the current names are objectively bad, and should be changed. There are no arguments that show that e.g. "setExt" is a better name than e.g. "withExtension". I see no problem with not acting on renaming suggestions in situations when there is no consensus.

- Naming things well matters. We need to treat renames in the same way as minor breaking changes. In the same way that we do not reject minor breaking fixes and improvements to the implementations of functions that have not yet been released, we should improve the naming of identifiers if there is consensus that the change is an improvement.

4.

I have often heard the argument that bikeshedding distracts from getting actual work done, or variations of such. I think this argument is flawed.

Discussions about minor things will continue regardless of whether small changes, such as renames, are rejected as a matter of policy or not. (Yes, this post is an example of this.) Yes, allowing some minor changes but not others will generate debate on why some changes were accepted and others not. Rejecting all minor changes does not prevent such debate from occurring, especially since there will always be exceptions (see e.g. std.meta).

I would thus like to argue that the policy of "no minor changes, even non-breaking" should be reviewed.

5.

Again, naming things well matters. An API with confusing or overlapping identifier names is a bad API. I've said this above but I want to say this again: we need to stop looking at renames as evil or as a waste of time, and look at them in the same way as (breaking) changes to the API / functionality. Just like API or functionality changes can be subjective in their usefulness, so can renames be controversial or overwhelmingly positive.

I do not disagree that how well identifiers are named is a secondary concern to the functionality that they provide. But this does not mean that we should ignore the quality of the names, and furthermore, reject any attempts to improve them.

6.

Concerning the naming itself.

My involvement comes from when my PR to rename setExt to withExtension was closed.

I would like to present a very similar case in another language, JavaScript.

The String type has two methods with similar names and functionality: "substr" and "substring". If you search the web, you will find a great deal of confusion over these two:

http://stackoverflow.com/questions/3745515/what-is-the-difference-between-substr-and-substring
https://nathanhoad.net/javascript-difference-between-substr-and-substring
http://javarevisited.blogspot.com/2013/08/difference-between-substr-vs-substring-in-JavaScript-tutorial-example.html
https://rapd.wordpress.com/2007/07/12/javascript-substr-vs-substring/
https://www.youtube.com/watch?v=OAameXW5r10

I think it's safe to say that it's one of JavaScript's many small warts.
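
To make the confusion concrete, here is a minimal sketch (written as TypeScript; the sample string and variable names are mine, chosen only for illustration). Both methods take a start index as the first argument, but the second argument means something different in each:

```typescript
// Illustrative sample string; any string would do.
const text: string = "Hello, world";

// substr(start, length): the second argument is a LENGTH.
const bySubstr: string = text.substr(3, 5);

// substring(start, end): the second argument is an END INDEX (exclusive).
const bySubstring: string = text.substring(3, 5);

console.log(bySubstr);    // "lo, w"
console.log(bySubstring); // "lo"
```

Nothing in the two names hints at which one takes a length and which one takes an end index, which is the same kind of problem as having to guess which function in a pair is the lazy one.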

The closest analogy with the case at hand is the toLower/toLowerCase functions. It would be unfortunate if we were to have such warts in D, so I think we should at least not outright reject PRs which fix them.
