ctubbsii opened a new issue, #5845: URL: https://github.com/apache/accumulo/issues/5845

**Is your feature request related to a problem? Please describe.**

This addresses several problems:

1. We have too much configuration that could be expressed more succinctly, which would make it easier for users to maintain.
2. Some server types don't have a port search capability (manager, gc, monitor), and some of those should (manager, gc).
3. The port search properties don't have a documented upper limit, so we just hard-code a limit of 1000, but depending on the starting port, that could exceed the max port. This is unintuitive and unnecessarily complex.
4. The current unbounded way of searching ports makes it hard for users to configure and lock down their firewalls, because they can't easily predict which port will be used.
5. We now have `rpc.bind.addr` and `rpc.advertise.addr`, which interact with the per-server port addresses in potentially confusing ways.

**Describe the solution you'd like**

Delete individual server port configs, and use a single property for a global shared port search range. Create `rpc.bind.port` that is a port range type, using a syntax like `9995` (single) or `9995-10995` (inclusive-inclusive range). It could also, optionally, support ranges like `[9995, 10995)` (alternative syntax that supports inclusive and exclusive bounds). Have a default value that is plenty big enough for all servers in a small cluster, like `19000-19999`. Override it with `-o rpc.bind.port=9995` in the `conf/accumulo-env.sh` for the monitor, and any other process that requires a well-known address for users.

Users can also override the value on a per-server basis to get mutually exclusive ranges for different server types, or differently sized ranges, if they use different `accumulo.properties` files or use `rpc.bind.port` with different values in their `accumulo-env.sh`.

Users could also use a sequence number to make predictable ports for tserver1, tserver2, etc.
for the same host by choosing a port derived from the sequence number when starting more than one server of the same type. So, rather than these servers choosing the next port in the range, you could, for example, make tserver1 always be 10001, tserver2 always be 10002, etc.

This would bring the port configuration in line with the bind/advertise address configs, and reduce the total number of configs a user needs to set.

**Describe alternatives you've considered**

1. Do nothing and deal with it. Maybe it's fine.
2. Removing the port search properties (leave it always on) and supporting ranges in each of the `.port.search` properties. However, that doesn't simplify the config as much. It does, however, leave us open to having more than one port per server, for example for a port type other than the single `.client` ports we have today. I don't anticipate we need that, and the `rpc.bind.port` solution doesn't completely prevent the creation of additional secondary port properties. It just means we need to think about it if we end up in a situation where we need more ports for a particular server type.
3. Instead of creating a new property, use the same `rpc.bind.addr` property, but with a format like `host:portRange`, as in `localhost:3000-4000` or `localhost:[4000,5000)`. We would have to be careful not to use square brackets in a way that makes it harder to resolve a host address written with square brackets, as is common with IPv6 addresses (`[::1]:443`, for example).

**Additional context**

* Guava's Range type may be useful for serializing/deserializing ranges internally.
* Java's IntStream may be useful for manipulating ranges internally.
* The serialized format could be a simple `inclusiveA-inclusiveB` format as I've shown above, where `B > A && 1024 < A < 65535 && 1024 < B <= 65535`, or it can be expressed in a more mathematical way, like `[A, B]` or `[A, B)` to represent inclusive or exclusive on the bounds.
  We could choose one format for simplicity, or support both, so the user can decide and we will still know what they mean.
* A single integer value, `A`, should be interpreted as the range `[A, A+1)`, and implies no port searching: use that one value or fail.
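
To make the proposed syntax concrete, here is a minimal sketch of how a `rpc.bind.port` value might be parsed. It is not existing Accumulo code: the `PortRange` class and its `parse`, `ports`, and `portForSequence` methods are hypothetical names, and it uses only the JDK (Guava's Range could stand in for the two `int` fields). It assumes all three forms are accepted (`A`, `A-B`, and `[A, B]` / `[A, B)`), applies the bounds described above, and assumes the n-th server of a type simply takes the n-th port in the range.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

/** Hypothetical half-open port range [start, end); not an existing Accumulo class. */
final class PortRange {
  // Bounds from the constraint above: ports must be > 1024 and <= 65535.
  static final int MIN_PORT = 1025;
  static final int MAX_PORT = 65535;

  private static final Pattern SINGLE = Pattern.compile("(\\d+)");
  private static final Pattern DASHED = Pattern.compile("(\\d+)\\s*-\\s*(\\d+)");
  private static final Pattern BRACKETED =
      Pattern.compile("\\[\\s*(\\d+)\\s*,\\s*(\\d+)\\s*([\\])])");

  final int start; // inclusive
  final int end; // exclusive

  private PortRange(int start, int end) {
    if (start < MIN_PORT || end <= start || end - 1 > MAX_PORT) {
      throw new IllegalArgumentException("invalid port range [" + start + ", " + end + ")");
    }
    this.start = start;
    this.end = end;
  }

  /** Parses "A", "A-B" (both inclusive), "[A, B]" or "[A, B)". */
  static PortRange parse(String value) {
    String s = value.trim();
    Matcher m;
    try {
      if ((m = SINGLE.matcher(s)).matches()) {
        // A single value means [A, A+1): use that one port or fail.
        int a = Integer.parseInt(m.group(1));
        return new PortRange(a, a + 1);
      }
      if ((m = DASHED.matcher(s)).matches()) {
        int a = Integer.parseInt(m.group(1));
        int b = Integer.parseInt(m.group(2));
        if (b <= a) {
          throw new IllegalArgumentException("expected B > A in: " + value);
        }
        return new PortRange(a, b + 1);
      }
      if ((m = BRACKETED.matcher(s)).matches()) {
        int a = Integer.parseInt(m.group(1));
        int b = Integer.parseInt(m.group(2));
        return new PortRange(a, m.group(3).equals("]") ? b + 1 : b);
      }
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("port number too large in: " + value, e);
    }
    throw new IllegalArgumentException("unrecognized port range: " + value);
  }

  /** Candidate ports in ascending order, e.g. for a bounded port search. */
  IntStream ports() {
    return IntStream.range(start, end);
  }

  /** Port for the n-th server of a type (1-based), or empty if the range is too small. */
  OptionalInt portForSequence(int n) {
    if (n < 1 || n > end - start) {
      return OptionalInt.empty();
    }
    return OptionalInt.of(start + n - 1);
  }
}
```

Under these assumptions, `PortRange.parse("19000-19999").ports()` yields the 1000 candidate ports of the suggested default, and `PortRange.parse("10001-10010").portForSequence(2)` gives tserver2 port 10002. Normalizing every form to a half-open range internally keeps the single-value case (`[A, A+1)`) from needing special handling.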

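
The bracket ambiguity raised in alternative 3 can be illustrated with a small sketch of splitting a combined `host:portRange` value. This is only one possible rule, not a decided format: it assumes a leading `[` always starts an IPv6 literal, so a bracketed range can only appear after the separating colon, and it rejects unbracketed IPv6 addresses as ambiguous. The `HostAndPortRange` class and its `split` method are hypothetical names.

```java
/** Hypothetical holder for the two halves of a "host:portRange" value. */
final class HostAndPortRange {
  final String host;
  final String portRange; // left unparsed in this sketch

  private HostAndPortRange(String host, String portRange) {
    // A ':' left in the range part means the host was an unbracketed IPv6 address.
    if (host.isEmpty() || portRange.isEmpty() || portRange.indexOf(':') >= 0) {
      throw new IllegalArgumentException("expected host:portRange or [ipv6]:portRange");
    }
    this.host = host;
    this.portRange = portRange;
  }

  static HostAndPortRange split(String value) {
    String s = value.trim();
    if (s.startsWith("[")) {
      // Assumption: a leading '[' opens an IPv6 literal, never a port range.
      int close = s.indexOf(']');
      if (close < 0 || s.length() <= close + 1 || s.charAt(close + 1) != ':') {
        throw new IllegalArgumentException("expected [ipv6]:portRange in: " + value);
      }
      return new HostAndPortRange(s.substring(1, close), s.substring(close + 2));
    }
    // Otherwise the first ':' separates the host from the range.
    int colon = s.indexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("missing ':' in: " + value);
    }
    return new HostAndPortRange(s.substring(0, colon), s.substring(colon + 1));
  }
}
```

With this rule, `localhost:[4000,5000)` splits into `localhost` and `[4000,5000)`, while `[::1]:443` splits into `::1` and `443`. The extra care needed here is part of the case for a separate `rpc.bind.port` property.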