smengcl opened a new pull request, #9781:
URL: https://github.com/apache/ozone/pull/9781

   ## What changes were proposed in this pull request?
   
   1. Catch `IOException` instead of `Exception` in `compactDB` and 
`triggerSnapshotDefrag`. Note: an `IOException` is not guaranteed to carry a 
detail message either (i.e. `getMessage()` can still return null).
   2. Add a null check and fallback wherever `setErrorMsg(ex.getMessage())` is 
used in the code base.
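
For illustration, the null check and fallback in item 2 could look roughly like the sketch below. `ErrorMessages` and `orFallback` are hypothetical names, not code from this patch, and falling back to the exception class name is an assumption; the actual change may choose a different fallback string. The background is that protobuf-generated Java builders throw `NullPointerException` when a string setter such as `setErrorMsg` is given null.

```java
import java.io.IOException;

/** Hypothetical helper for illustration only; not part of the Ozone code base. */
final class ErrorMessages {
  private ErrorMessages() { }

  /**
   * Returns the exception's detail message, or the exception class name
   * when getMessage() is null (assumed fallback), so the result is never null.
   */
  static String orFallback(Throwable t) {
    String msg = t.getMessage();
    return msg != null ? msg : t.getClass().getName();
  }

  public static void main(String[] args) {
    // An exception with a detail message: the message is passed through.
    System.out.println(orFallback(new IOException("disk full")));
    // An exception without a detail message: getMessage() is null,
    // so the class name is used instead.
    System.out.println(orFallback(new IOException()));
  }
}
```

A call site would then pass `ErrorMessages.orFallback(ex)` to `setErrorMsg` instead of `ex.getMessage()`, so the builder never receives null.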
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-14649
   
   ## How was this patch tested?
   
   - Manually tested on a cluster (custom build):
   
   Before:
   
   ```bash
   $ sudo -u om ozone admin om snapshot defrag --service-id=ozone1771242317 
--node-id=om1546336036
   Triggering Snapshot Defrag Service ...
   com.google.protobuf.ServiceException: 
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
java.lang.NullPointerException
        at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerAdminProtocolProtos$TriggerSnapshotDefragResponse$Builder.setErrorMsg(OzoneManagerAdminProtocolProtos.java:5369)
        at 
org.apache.hadoop.ozone.protocolPB.OMAdminProtocolServerSideImpl.triggerSnapshotDefrag(OMAdminProtocolServerSideImpl.java:133)
        at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerAdminProtocolProtos$OzoneManagerAdminService$2.callBlockingMethod(OzoneManagerAdminProtocolProtos.java:5549)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:995)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:923)
        at 
java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2905)
   , while invoking $Proxy20.triggerSnapshotDefrag over null. Retrying after 
sleeping for 1000ms.
   com.google.protobuf.ServiceException: 
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
java.lang.NullPointerException
        at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerAdminProtocolProtos$TriggerSnapshotDefragResponse$Builder.setErrorMsg(OzoneManagerAdminProtocolProtos.java:5369)
        at 
org.apache.hadoop.ozone.protocolPB.OMAdminProtocolServerSideImpl.triggerSnapshotDefrag(OMAdminProtocolServerSideImpl.java:133)
        at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerAdminProtocolProtos$OzoneManagerAdminService$2.callBlockingMethod(OzoneManagerAdminProtocolProtos.java:5549)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:995)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:923)
        at 
java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2905)
   , while invoking $Proxy20.triggerSnapshotDefrag over null. Retrying after 
sleeping for 1000ms.
   ```
   
   After:
   
   ```bash
   $ sudo -u om ozone admin om snapshot defrag --service-id=ozone1771242317 
--node-id=om1546336036
   Triggering Snapshot Defrag Service ...
   Failed to trigger snapshot defragmentation: Failed to Decommission OM. 
Error: Request to trigger snapshot defragmentation, sent to 
om1546336036[ccycloud-8.quasar-quotgb.root.comops.site:9862] failed with error: 
java.lang.UnsupportedOperationException
        at org.apache.hadoop.hdds.utils.db.Codec.fromCodecBuffer(Codec.java:94)
        at 
org.apache.hadoop.hdds.utils.db.TypedTable$1.convert(TypedTable.java:603)
        at 
org.apache.hadoop.hdds.utils.db.TypedTable$RawIterator.next(TypedTable.java:684)
        at 
org.apache.hadoop.hdds.utils.db.TypedTable$RawIterator.next(TypedTable.java:635)
        at 
org.apache.hadoop.ozone.om.snapshot.defrag.SnapshotDefragService.getTableBounds(SnapshotDefragService.java:242)
        at 
org.apache.hadoop.ozone.om.snapshot.defrag.SnapshotDefragService.performFullDefragmentation(SnapshotDefragService.java:272)
        at 
org.apache.hadoop.ozone.om.snapshot.defrag.SnapshotDefragService.checkAndDefragSnapshot(SnapshotDefragService.java:615)
        at 
org.apache.hadoop.ozone.om.snapshot.defrag.SnapshotDefragService.triggerSnapshotDefragOnce(SnapshotDefragService.java:670)
        at 
org.apache.hadoop.ozone.om.OzoneManager.triggerSnapshotDefrag(OzoneManager.java:3518)
        at 
org.apache.hadoop.ozone.protocolPB.OMAdminProtocolServerSideImpl.triggerSnapshotDefrag(OMAdminProtocolServerSideImpl.java:130)
        at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerAdminProtocolProtos$OzoneManagerAdminService$2.callBlockingMethod(OzoneManagerAdminProtocolProtos.java:5549)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:995)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:923)
        at 
java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2905)
   
   Failed to Decommission OM. Error: Request to trigger snapshot 
defragmentation, sent to 
om1546336036[ccycloud-8.quasar-quotgb.root.comops.site:9862] failed with error: 
java.lang.UnsupportedOperationException
   ```


