ctubbsii opened a new issue, #5845:
URL: https://github.com/apache/accumulo/issues/5845

   **Is your feature request related to a problem? Please describe.**
   
   This addresses several problems:
   
   1. We have too much configuration that could be expressed more succinctly, 
which would make it easier for users to maintain.
   2. Some server types don't have a port search capability (manager, gc, 
monitor), and some of those should (manager, gc).
   3. The port search properties don't have a documented upper limit, so we 
just hard-code a limit of 1000, but depending on the starting port, that 
could exceed the max port. This is unintuitive and unnecessarily complex.
   4. The current unbounded way of searching ports makes it hard for users to 
configure and lock down their firewalls, because they can't easily predict 
which port will be used.
   5. We now have `rpc.bind.addr` and `rpc.advertise.addr` that interact with 
the per-server port addresses in potentially confusing ways.
   
   **Describe the solution you'd like**
   
   Delete individual server port configs, and use a single property for a 
global shared port search range.
   
   Create `rpc.bind.port` that is a port range type, using a syntax like `9995` 
(single) or `9995-10995` (inclusive-inclusive range). It could also, 
optionally, support ranges like `[9995, 10995)` (alternative syntax that 
supports inclusive and exclusive bounds).
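
   To make the proposed syntax concrete, here is a minimal parsing sketch. 
`PortRange` is a hypothetical class, not an existing Accumulo type, and the 
bounds check (ports above 1024, up to 65535) is an assumption taken from the 
constraints listed under "Additional context" below.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value type for the proposed rpc.bind.port property. */
final class PortRange {
  // Matches "9995" or "9995-10995" (both ends inclusive).
  private static final Pattern SIMPLE = Pattern.compile("(\\d{1,5})(?:-(\\d{1,5}))?");
  // Matches interval notation such as "[9995, 10995)" or "(9995,10995]".
  private static final Pattern INTERVAL =
      Pattern.compile("([\\[(])\\s*(\\d{1,5})\\s*,\\s*(\\d{1,5})\\s*([\\])])");

  final int first; // lowest port, inclusive
  final int last; // highest port, inclusive

  private PortRange(int first, int last) {
    // Assumed limits: unprivileged ports only, and a non-empty range.
    if (first <= 1024 || last > 65535 || first > last) {
      throw new IllegalArgumentException("Invalid port range: " + first + "-" + last);
    }
    this.first = first;
    this.last = last;
  }

  static PortRange parse(String value) {
    String v = value.trim();
    Matcher m = SIMPLE.matcher(v);
    if (m.matches()) {
      int lo = Integer.parseInt(m.group(1));
      // A single value means a range of exactly one port.
      int hi = m.group(2) == null ? lo : Integer.parseInt(m.group(2));
      return new PortRange(lo, hi);
    }
    m = INTERVAL.matcher(v);
    if (m.matches()) {
      // Normalize exclusive bounds to inclusive ones.
      int lo = Integer.parseInt(m.group(2)) + (m.group(1).equals("(") ? 1 : 0);
      int hi = Integer.parseInt(m.group(3)) - (m.group(4).equals(")") ? 1 : 0);
      return new PortRange(lo, hi);
    }
    throw new IllegalArgumentException("Unrecognized port range: " + value);
  }

  int size() {
    return last - first + 1;
  }
}
```

   Under these assumptions, `PortRange.parse("[9995, 10995)")` yields the 
inclusive range 9995 through 10994, and `PortRange.parse("9995")` yields a 
range containing only port 9995.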
   
   Have a default value that is large enough for all servers in a small 
cluster, like `19000-19999`. Override it with `-o rpc.bind.port=9995` in 
`conf/accumulo-env.sh` for the monitor, and for any other process that requires 
a well-known address for users. Users can also override the value on a 
per-server basis to get mutually exclusive ranges for different server types, 
or differently sized ranges, if they use different `accumulo.properties` files 
or use `rpc.bind.port` with different values in their `accumulo-env.sh`.
   
   Users could also use a sequence number to make predictable ports for 
tserver1, tserver2, etc. on the same host by choosing a port derived from the 
sequence number when starting more than one server of the same type. So, rather 
than each server choosing the next free port in the range, you could, for 
example, make tserver1 always be 10001, tserver2 always be 10002, etc.
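
   One way to read the sequence-number idea is as a fixed offset from the 
start of the range. The sketch below assumes a 1-based sequence number and 
inclusive range bounds; `SequencedPort` and `portFor` are hypothetical names 
used only for illustration.

```java
/** Hypothetical helper that maps a server's sequence number to a fixed port. */
final class SequencedPort {
  /**
   * Returns the port for the given 1-based sequence number, counting up from
   * the first port of the inclusive range [first, last].
   */
  static int portFor(int first, int last, int sequence) {
    if (sequence < 1) {
      throw new IllegalArgumentException("Sequence number must be >= 1: " + sequence);
    }
    // Use long arithmetic so a huge sequence number cannot overflow.
    long port = (long) first + sequence - 1;
    if (port > last) {
      throw new IllegalArgumentException(
          "Sequence " + sequence + " falls outside range " + first + "-" + last);
    }
    return (int) port;
  }
}
```

   With a range starting at 10001, `portFor(10001, 10999, 1)` returns 10001 
and `portFor(10001, 10999, 2)` returns 10002, matching the tserver1 and 
tserver2 example above.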
   
   This would bring the port configuration in line with the bind/advertise 
address configs, and reduce the total number of properties a user needs to 
set.
   
   **Describe alternatives you've considered**
   
   1. Do nothing and deal with it. Maybe it's fine.
   2. Removing the port search properties (leave it always on) and supporting 
ranges in each of the `.port.search` properties. However, that doesn't simplify 
the config as much. It does, however, leave us open to having more than one 
port per server, for example for a port type other than the single `.client` 
ports we have today. I don't anticipate we need that, and the `rpc.bind.port` 
solution doesn't completely prevent the creation of additional secondary port 
properties... it just means we need to think about it if we end up in a 
situation where we need more ports for a particular server type.
   3. Instead of creating a new property, use the same `rpc.bind.addr` 
property, but with a format like `host:portRange`, as in `localhost:3000-4000` 
or `localhost:[4000,5000)`. We would have to be careful that square brackets in 
the port range don't make it harder to resolve a host address that also uses 
square brackets, as is common with IPv6 addresses, for example `[::1]:443`.
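
   For alternative 3, the bracket ambiguity can be handled by checking for a 
bracketed host first. This is only a sketch of one possible approach; 
`BindAddress.split` is a hypothetical helper, and it assumes unbracketed hosts 
(host names and IPv4 addresses) never contain a colon.

```java
/** Hypothetical splitter for a combined "host:portRange" value. */
final class BindAddress {
  /** Returns a two-element array: {host, portRange}. */
  static String[] split(String value) {
    int sep;
    if (value.startsWith("[")) {
      // Bracketed IPv6 literal: the host ends at the first ']' and must be
      // followed directly by the ':' separator.
      int close = value.indexOf(']');
      if (close < 0 || close + 1 >= value.length() || value.charAt(close + 1) != ':') {
        throw new IllegalArgumentException("Malformed bracketed host: " + value);
      }
      sep = close + 1;
    } else {
      // Host names and IPv4 addresses contain no ':', so the first one separates.
      sep = value.indexOf(':');
      if (sep <= 0) {
        throw new IllegalArgumentException("Expected host:portRange: " + value);
      }
    }
    String host = value.substring(0, sep);
    String range = value.substring(sep + 1);
    if (range.isEmpty()) {
      throw new IllegalArgumentException("Missing port range: " + value);
    }
    return new String[] {host, range};
  }
}
```

   Because the host is consumed before the port range is examined, a value 
like `[::1]:[4000,5000)` splits into the host `[::1]` and the range 
`[4000,5000)` without confusing the two uses of square brackets.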
   
   **Additional context**
   
   * Guava's Range type may be useful for serializing/deserializing ranges 
internally.
   * Java's IntStream may be useful for manipulating ranges internally.
   * The serialized format could be a simple `inclusiveA-inclusiveB` format as 
I've shown above, where `B > A && 1024 < A < 65535 && 1024 < B <= 65535`, or it 
can be expressed in a more mathematical way, like `[A, B]` or `[A, B)` to 
represent inclusive or exclusive on the bounds. We could choose one format for 
simplicity, or support both, so the user can decide and we will still know what 
they mean.
   * A single integer value, `A`, should be interpreted as the range `[A, 
A+1)`, and implies no port searching: use that one value or fail.
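
   As a rough illustration of how a half-open range and `IntStream` could 
drive the port search, the sketch below tries each port in order and returns 
the first one that binds. `PortSearch.bindFirstFree` is a hypothetical helper, 
and a plain `java.net.ServerSocket` stands in for whatever server transport 
Accumulo would actually bind.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.stream.IntStream;

/** Hypothetical port search over a half-open range [startInclusive, endExclusive). */
final class PortSearch {
  /**
   * Tries each port in order and returns the first socket that binds.
   * A single port A is the range [A, A + 1): one attempt, then failure.
   */
  static ServerSocket bindFirstFree(int startInclusive, int endExclusive) throws IOException {
    IOException lastFailure = null;
    for (int port : IntStream.range(startInclusive, endExclusive).toArray()) {
      try {
        return new ServerSocket(port);
      } catch (IOException e) {
        // Port unavailable; remember why and move on to the next candidate.
        lastFailure = e;
      }
    }
    throw new IOException(
        "No free port in [" + startInclusive + ", " + endExclusive + ")", lastFailure);
  }
}
```

   Calling `bindFirstFree(9995, 9996)` makes exactly one attempt, which gives 
the "use that one value or fail" behavior, while a call such as 
`bindFirstFree(19000, 20000)` searches the whole default range proposed above.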
   

