[jira] [Commented] (MESOS-2616) Update C++ style guide on variable naming.

Michael Park (JIRA) Fri, 17 Apr 2015 10:41:42 -0700

    [ 
https://issues.apache.org/jira/browse/MESOS-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500269#comment-14500269
 ]


Michael Park commented on MESOS-2616:
-------------------------------------

Sorry for the late response, but I'll gather my thoughts here. I think in the 
short run, this is going in the right direction. I think ultimately we need to 
accept that using camel case "as much as possible" is not a good naming 
convention in C++ and decide which naming schemes to assign to various groups 
that ought to be distinguished.

h3. A few groups to be assigned different naming schemes.

In terms of naming scheme in C++, it's very much worth categorizing the 
following groups.
* Types
* Local variables (including function parameters)
* Member variables
* Member-access functions

We outline why we want the above groups to have different naming schemes below.

h6. Separate types and (local/member) variables.

We want type names to have a different naming scheme than local/member 
variables because in many cases we name the variable after the type. e.g. a 
local variable {{Master master = new Master(...);}} and a member variable 
{{Master master;}}

h6. Separate member variables from member-access functions.

It's quite common expose member variables via a member-access functions 
(getters). Most commonly to provide constant access to the member variable. In 
which case we need to give them different names. e.g.

{code}
const Address& address() const { return address_; }
{code}

h6. Separate local variables (including function parameters) and member 
variables.

We want local variables and member variables to have different naming schemes 
to avoid shadowing completely. The issue of shadowing in constructor parameters 
have come up, so let's start there.

Consider the case where we have the same name.

{code}
class Obj {
  Obj(std::vector<int> elems) : elems(std::move(elems)) {}
  std::vector<int> elems;
};
{code}

In {{elems(std::move(elems))}}, the first {{elems}} refers to the member, and 
the second {{elems}} refers to the parameter. So this works correctly. Issues 
start if/when someone comes along and starts to fill in the body.

{code}
class Obj {
  Obj(std::vector<int> elems) : elems(std::move(elems)) {
    // Using 'elems' here refers to the parameter 'elem' rather than the member 
'elem' which is a moved-out object!
    // We'll probably get a segfault or even worse a silent incorrect behavior.
  }
  std::vector<int> elems;
};
{code}

However, the issue isn't special to constructors, nor is it special to 
parameters.

{code}
class Obj {
  void Fn(std::vector<int> elems) {
    // Referring to 'elems' here refers to the parameter rather than the member!
  }
  std::vector<int> elems;
};
{code}

{code}
class Obj {
  void Fn() {
    std::vector<int> elems;
    // Referring to 'elems' here refers to the local variable rather than the 
member!
  }
  std::vector<int> elems;
};
{code}

The 2 examples are small enough such that all of the uses of the name are in 
the same lexical scope, but imagine a class with many member variables with a 
long member function with intermediate local variables. {{DRFSorter}} is a good 
example of where this gets messy. {{DRFSorter}} has a member variable called 
{{resources}}, and has a bunch of functions that operate on {{Resources}} which 
ideally would just be named {{resources}}. Consider the following functions in 
{{DRFSorter}}.

{code}
void DRFSorter::unallocated(
    const string& name,
    const Resources& resources)
{
  allocations[name] -= resources;

  if (!dirty) {
    update(name);
  }
}

void DRFSorter::add(const Resources& _resources)
{
  resources += _resources;

  // We have to recalculate all shares when the total resources
  // change, but we put it off until sort is called
  // so that if something else changes before the next allocation
  // we don't recalculate everything twice.
  dirty = true;
}
{code}

In order to use {{resources}} as the parameter name for {{unallocated}} we must 
read the entire function and make sure that we don't want to refer to the 
member {{resources}}. Let's say someone comes along during their debugging 
session and tries to observe the state of the  member {{resources}}. If they 
print the state of {{resources}} inside {{unallocated}}, Surprise! They're 
observing the local {{resources}} rather than member {{resources}}.

h3. Naming schemes

Google, for example separates the categories with the following naming schemes:

* Types: title case, e.g. {{TitleCase}}
* Local variables (including function parameters): snake case, e.g. 
{{snake_case}}
* Member variables: snake case followed by a trailing underscore, e.g. 
{{snake_case_}}
* Member-access functions: snake_case, e.g. {{snake_case()}}

Another naming scheme that I've worked with is:

* Types: title case with a leading "T", e.g. {{TTitleCase}}
* Local variables (including function parameters): snake case, e.g. 
{{snake_case}}
* Member variables: title case, e.g. {{TitleCase}}
* Member-access functions: title case with a leading "Get", e.g. 
{{GetTitleCase}}

The point is that these naming schemes avoid the ambiguities and shadowing 
issues rather than having to reason through in a case-by-case basis.

In Mesos, we have:

* Types: title case, e.g. {{TitleCase}}
* Local variables (including function parameters): camel case, e.g. 
{{camelCase}}
* Member variables: camel case, e.g. {{camelCase}}
* Member-access functions: camel case, e.g. {{camelCase}}

The fact that we *try* to use camel case as much as possible leads to us having 
to disambiguate in a case-by-case manner is many places around the codebase.

Sometimes only a couple of them collide, sometimes all of them collide. For 
example, in {{3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp}}, we 
have all 3 colliding:

{code}
class IPNetwork
{
  public:
  IPNetwork(const IP& _address)
      : address_(_address) {}
  IP address() const { return address_; }
  IP address_;
};
{code}

h6. Conclusion

I think we should try to figure out naming schemes to assign these different 
groups rather than struggling to figure out how to disambiguate them in various 
case-by-case scenarios.

> Update C++ style guide on variable naming. 
> -------------------------------------------
>
>                 Key: MESOS-2616
>                 URL: https://issues.apache.org/jira/browse/MESOS-2616
>             Project: Mesos
>          Issue Type: Documentation
>            Reporter: Till Toenshoff
>            Assignee: Alexander Rukletsov
>            Priority: Minor
>
> Our variable naming guide currently is not really explaining use cases for 
> leading or trailing underscores as found a lot within our codebase. 
> We should correct that.
> The following was copied from the review description for allowing discussions 
> where needed:
> Documents the patterns we use to name variables and function arguments in our 
> codebase.
> h4.Leading underscores to avoid ambiguity.
> We use this pattern extensively in libprocess, stout and mesos, a few 
> examples below.
> * stout/try.hpp:105
> {noformat}
> Try(State _state, T* _t = NULL, const std::string& _message = "")
>   : state(_state), t(_t), message(_message) {}
> {noformat}
> * process/http.hpp:480
> {noformat}
>   URL(const std::string& _scheme,
>       const std::string& _domain,
>       const uint16_t _port = 80,
>       const std::string& _path = "/",
>       const hashmap<std::string, std::string>& _query =
>         (hashmap<std::string, std::string>()),
>       const Option<std::string>& _fragment = None())
>     : scheme(_scheme),
>       domain(_domain),
>       port(_port),
>       path(_path),
>       query(_query),
>       fragment(_fragment) {}
> {noformat}
> * slave/containerizer/linux_launcher.cpp:56
> {noformat}
> LinuxLauncher::LinuxLauncher(
>     const Flags& _flags,
>     int _namespaces,
>     const string& _hierarchy)
>   : flags(_flags),
>     namespaces(_namespaces),
>     hierarchy(_hierarchy) {}
> {noformat}
> h4.Trailing undescores as prime symbols.
> We use this pattern in the code, though not extensively. We would like to see 
> more pass-by-value instead of creating copies from a variable passed by const 
> reference.
> * master.cpp:2942
> {noformat}
> // Create and add the slave id.
> SlaveInfo slaveInfo_ = slaveInfo;
> slaveInfo_.mutable_id()->CopyFrom(newSlaveId());
> {noformat}
> * slave.cpp:4180
> {noformat}
> ExecutorInfo executorInfo_ = executor->info;
> Resources resources = executorInfo_.resources();
> resources += taskInfo.resources();
> executorInfo_.mutable_resources()->CopyFrom(resources);
> {noformat}
> * status_update_manager.cpp:474
> {noformat}
> // Bounded exponential backoff.
> Duration duration_ =
> std::min(duration * 2, STATUS_UPDATE_RETRY_INTERVAL_MAX);
> {noformat}
> * containerizer/mesos/containerizer.cpp:109
> {noformat}
> // Modify the flags to include any changes to isolation.
> Flags flags_ = flags;
> flags_.isolation = isolation;
> {noformat}
> h4.Passing arguments by value.
> * slave.cpp:2480
> {noformat}
> void Slave::statusUpdate(StatusUpdate update, const UPID& pid)
> {
>   ...
>   // Set the source before forwarding the status update.
>   update.mutable_status()->set_source(
>       pid == UPID() ? TaskStatus::SOURCE_SLAVE : TaskStatus::SOURCE_EXECUTOR);
>   ...
> }
> {noformat}
> * process/metrics/timer.hpp:103
> {noformat}
>   static void _time(Time start, Timer that)
>   {
>     const Time stop = Clock::now();
>     double value;
>     process::internal::acquire(&that.data->lock);
>     {
>       that.data->lastValue = T(stop - start).value();
>       value = that.data->lastValue.get();
>     }
>     process::internal::release(&that.data->lock);
>     that.push(value);
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-2616) Update C++ style guide on variable naming.

Reply via email to