elek commented on a change in pull request #2136:
URL: https://github.com/apache/ozone/pull/2136#discussion_r621074907
##########
File path:
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/shell/keys/CopyKeyHandler.java
##########
@@ -82,17 +84,19 @@ protected void execute(OzoneClient client, OzoneAddress
address)
OzoneVolume vol = client.getObjectStore().getVolume(volumeName);
OzoneBucket bucket = vol.getBucket(bucketName);
- if (replicationFactor == null) {
- replicationFactor = ReplicationFactor.valueOf(
- getConf().getInt(OZONE_REPLICATION, OZONE_REPLICATION_DEFAULT));
- }
-
if (replicationType == null) {
replicationType = ReplicationType.valueOf(
- getConf().get(OZONE_REPLICATION_TYPE,
- OZONE_REPLICATION_TYPE_DEFAULT));
+ getConf()
+ .get(OZONE_REPLICATION_TYPE, OZONE_REPLICATION_TYPE_DEFAULT));
}
+ if (replication == null) {
Review comment:
That's a good question (thanks for it), and I think there are multiple
parts.
1. This is only for `ozone sh` commands not for s3 (where we have storage
classes) or for o3fs/ofs (where we plan to configure it cluster wide or bucket
level.
2. Originally I had the same idea. `ReplicationConfig` today can have
multiple parameters. (For example `3:2` can be defined as `new
ECReplicationConfig(3,2)`). I thought that it might be better to accept only
enums (or pre-defined strings) which are further defined by the administrators
(or hard-coded by the developers):
For example:
| replication type | enum / group name | replication config |
|----------------------------|---|---|
| RATIS | ONE | `new RatisReplicationConfig(ONE) |
| RATIS | THREE | `new RatisReplicationConfig(THREE) |
| STANDALONE | ONE | `new StandaloneReplicationConfig(ONE) |
| EC | `3:2` | `new EcReplicationConfig(3:2)` |
| EC | `6:4` | `new EcReplicationConfig(6:4)` |
| EC | `10:3` | `new EcReplicationConfig(10:3)` |
But when we add the `codec` to `ECReplicationConfig` (`RS` vs `XOR`) it will
create more and more entries in this table.
1. I think (at least at long term) administrators should have the
flexibility to define new groups (shouldn't be everything hardcoded in the code)
2. I think it's an additional complexity to manage all of these mappings.
As `ozone sh` is a low level cli tool, it seems to be easier to avoid this
grouping and just enable to directly configure all possible parameters
(especially as there is no limit on the SCM/OM side, we can handle different
configuration without any problem)
3. Compatibility
I also felt that this approach is a more natural extension to the current
scheme. We can introduce a new abstraction level (with new cli parameters like
`--storage-type`) where we accept names instead of low level parameters:
| name | replication type | replication config |
|----------------------------|---|---|
| REDUCED | RATIS | `new RatisReplicationConfig(ONE) |
| DEFAULT | RATIS | | `new RatisReplicationConfig(THREE) |
| STANDALONE | `new StandaloneReplicationConfig(ONE) |
| EC_3_2 | EC | `new EcReplicationConfig(3:2)` |
But I am not sure if we need this abstraction level, especially at the
beginning of the feature branch.
4. Enum vs string
When you asked about enum, we can talk about two parts: do we need
pre-defined groups AND/OR do we need to store in enum the groups?
I learned with the port changes that enum is very dangerous. We introduced a
new service on DN and added a new enum value for the service discovery API and
introduced a lot of backward compatibility problem (new enum values couldn't be
deserialized with old clients). String seems to be more flexible (we can
validate it after protobuf conversion).
This example doesn't fully apply here, as it's a full client-only change.
(on protobuf this string is not sent), but I learned that enums (in some cases)
can be harder
TLDR: This is why I think the string based approach is better here, but
these are somewhat subjective points, please add your opinion.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]