In response to renewed attention at the Foundation toward addressing
culturally problematic language and terms often used in technical
documentation and discussion, several projects have begun discussions, or
made proposals, or started work along these lines.

The HBase PMC began its own discussion on private@ on June 9, 2020 with an
observation of this activity and this suggestion:

There is a renewed push back against classic technology industry terms that
have negative modern connotations.

In the case of HBase, the following substitutions might be proposed:

- Coordinator instead of master

- Worker instead of slave

Recommendations for these additional substitutions also come up in this
type of discussion:

- Accept list instead of white list

- Deny list instead of black list

Unfortunately we have Master all over our code base, baked into various
APIs and configuration variable names, so for us the necessary changes
amount to a new major release and deprecation cycle. It could well be worth
it in the long run. We exist only as long as we draw a willing and
sufficient contributor community. It also wouldn’t be great to have an
activist fork appear somewhere, even if unlikely to be successful.

Relevant JIRAs are:

   - HBASE-12677 <https://issues.apache.org/jira/browse/HBASE-12677>:
   Update replication docs to clarify terminology
   - HBASE-13852 <https://issues.apache.org/jira/browse/HBASE-13852>:
   Replace master-slave terminology in book, site, and javadoc with a more
   modern vocabulary
   - HBASE-24576 <https://issues.apache.org/jira/browse/HBASE-24576>:
   Changing "whitelist" and "blacklist" in our docs and project

In response to this proposal, a member of the PMC asked if the term
'master' used by itself would be fine, because we only use 'slave' in
replication documentation and that is easily addressed. Others on the PMC
replied that even if only 'master' is used, in this context it is still a
problem.

For folks who are surprised or lacking context on the details of this
discussion, one PMC member offered a link to this draft RFC as background:
https://tools.ietf.org/id/draft-knodel-terminology-00.html

There was general support for removing the term "master" / "hmaster" from
our code base and using the terms "coordinator" or "leader" instead. In the
context of replication, "worker" makes less sense and perhaps "destination"
or "follower" would be more appropriate terms.

One PMC member's thoughts on language and non-native English speakers are
worth including in their entirety:

While words like blacklist/whitelist/slave clearly have those negative
references, word master might not have the same impact for non native
English speakers like myself where the literal translation to my mother
tongue does not have this same bad connotation. Replacing all references
for word *master* on our docs/codebase is a huge effort, I guess such a
decision would be more suitable for native English speakers folks, and
maybe we should consider the opinion of contributors from that ethnic
minority as well?

These are good questions for public discussion.

At this time, the PMC has a consensus in support of making the terminology
changes discussed above. However, we also have concerns about what it would
take to accomplish meaningful changes. Several on the PMC offered support
in the form of cycles to review pull requests and patches, and two PMC
members offered personal bandwidth for creating and releasing new code
lines as needed to complete a deprecation cycle.

Unfortunately, the terms "master" and "hmaster" appear throughout our code
base in class names, in user-facing APIs subject to our project
compatibility guidelines, and in configuration variable names, which are
also covered by the compatibility guidelines given the impact of changes on
operators and operations. The changes being discussed are not backwards
compatible and cannot be made quickly while also preserving compatibility.
There must be a deprecation cycle. First, we must tag all implicated public
APIs and configuration variables as deprecated, and release HBase 3 with
these deprecations in place. Then, we must undertake rename and removal as
appropriate, and release the result as HBase 4.
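
To make the shape of such a deprecation cycle concrete, here is a minimal
sketch in Java. It is only an illustration under stated assumptions: the
class name, the method names, and both configuration keys are hypothetical
and are not real HBase identifiers, and a plain Map stands in for the
project's configuration object. In the deprecating release the old name
keeps working but is tagged and warns; in the following major release the
old accessor and the fallback would be removed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; none of these names are real HBase identifiers.
final class DeprecationSketch {
    // Old key, still honored during the deprecating release.
    static final String OLD_KEY = "example.master.port";
    // New key introduced alongside the old one.
    static final String NEW_KEY = "example.coordinator.port";

    /** Old accessor: tagged deprecated now, removed in the next major release. */
    @Deprecated
    static Optional<String> getMasterPort(Map<String, String> conf) {
        return getCoordinatorPort(conf);
    }

    /** New accessor: prefers the new key, falls back to the deprecated key. */
    static Optional<String> getCoordinatorPort(Map<String, String> conf) {
        String value = conf.get(NEW_KEY);
        if (value != null) {
            return Optional.of(value);
        }
        String old = conf.get(OLD_KEY);
        if (old != null) {
            // Warn operators so they can migrate before the old key is removed.
            System.err.println("WARN: " + OLD_KEY + " is deprecated; use " + NEW_KEY);
            return Optional.of(old);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(OLD_KEY, "16000"); // an operator who has not migrated yet
        System.out.println(getCoordinatorPort(conf).orElse("unset"));
    }
}
```

The point of the fallback is that existing deployments keep working across
the first major release while still being told what to change; only the
second major release breaks configurations that were never updated.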

One PMC member raised a question in this context, included here in its
entirety:

Are we willing to commit to rolling through the major versions at a pace
that's necessary to make this transition as swift as
reasonably possible?

This is a question for all of us. For the PMC, who would supervise the
effort, perhaps contribute to it, and certainly vote on the release
candidates. For contributors and potential contributors, who would provide
the necessary patches. For committers, who would be required to review and
commit the relevant changes.

Although there has been some initial discussion, there is no single
proposal, or plan, or set of decisions made at this time. Wrestling with
this concern and the competing concerns involved with addressing it
(motivation for change versus motivation for compatibility) is a task for
all of us to undertake (or not) in public on dev@ and user@.
