smengcl opened a new pull request, #4420:
URL: https://github.com/apache/ozone/pull/4420

   ## What changes were proposed in this pull request?
   
   1. Print proper JSON **object** (`{ }`) if `--with-keys=true`.
   2. Print proper JSON **array** (`[ ]`) if `--with-keys=false`.
   3. `--with-keys` now defaults to `true`. Tweak some option names. Improve 
error messages. There should be no compatibility concern whatsoever, as this is 
intended to be a debug tool.
   4. Rewrite and parameterize `TestLDBCli`. Add new test cases.
   5. Refactor `DBScanner` for readability and maintainability.
   
   ### Note
   
   Regarding the core serialization logic in `DBScanner`, I've chosen to stick 
to the current `Gson` approach, which serializes **each** entry and immediately 
prints it. It should consume less memory than [gathering all 
entries](https://github.com/apache/ozone/blob/04cd54ce6593024dc98a9867e1fd829c4f25f85a/hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/DBScanner.java#L129-L135)
 and then serializing and printing them in one go (that could OOM if the limit 
is very high, e.g. a few billion entries, while the current approach should 
work just fine).
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6064
   
   ## How was this patch tested?
   
   - Rewritten `TestLDBCli` and added new test cases.
     - Fully migrated to JUnit 5.
     - Fully parameterized tests. More parameter combinations can easily be 
added in the future.
       - Use `Named.of()` to describe each parameter for maintainability.
       - Please feel free to contribute more test cases, now that adding a new 
case is just a matter of a few lines. :)
     - Rewritten `keyTable` tests, which are heavily inspired by @adoroszlai 's 
change in #2917 (in which `testOMDB` is split into multiple test cases).
     - Datanode DB `block_data` table schema V3 and V2 tests are completely 
rewritten.
     - This took longer for me than fixing and refactoring `DBScanner` :D
   
   <img width="605" alt="IntelliJ" 
src="https://user-images.githubusercontent.com/50227127/226076360-2c518e58-77b6-4e83-a00b-138ec57d7c79.png">
   
   ## Future potential improvement
   
   - [ ] For the SchemaV3 `block_data` table, we could further nest entries 
inside another map, with the outermost layer being `containerId`.
   
   For example, currently (unchanged in this PR), the JSON key for an entry 
inside V3 `block_data` is `containerId: blockId`:
   
   ```shell
   { "2: 3": {
     "blockID": {
       "containerBlockID": {
         "containerID": 2,
         "localID": 3
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   }, "2: 4": {
     "blockID": {
       "containerBlockID": {
         "containerID": 2,
         "localID": 4
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   } }
   ```
   
   With another layer, the output could become even cleaner and easier to 
filter with `jq`:
   
   ```shell
   {
     "2": {
       "3": {
         "blockID": {
           "containerBlockID": {
             "containerID": 2,
             "localID": 3
           },
           "blockCommitSequenceId": 0
         },
         "metadata": {},
         "size": 0
       },
       "4": {
         "blockID": {
           "containerBlockID": {
             "containerID": 2,
             "localID": 4
           },
           "blockCommitSequenceId": 0
         },
         "metadata": {},
         "size": 0
       }
     }
   }
   ```
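
The regrouping proposed above can be sketched in a few lines of Python. This is
only an illustration of the target shape, not part of this PR:
`nest_by_container` is a hypothetical helper, and it assumes every key has the
`containerId: blockId` form shown in the current output.

```python
import json


def nest_by_container(flat: dict) -> dict:
    """Regroup {"<containerId>: <blockId>": block} into
    {"<containerId>": {"<blockId>": block}}."""
    nested: dict = {}
    for composite_key, block in flat.items():
        # Split on the first colon only, then drop the surrounding spaces.
        container_id, block_id = (
            part.strip() for part in composite_key.split(":", 1)
        )
        nested.setdefault(container_id, {})[block_id] = block
    return nested


# Abridged version of the V3 sample, keeping only the "size" field.
flat_output = json.loads('{ "2: 3": {"size": 0}, "2: 4": {"size": 0} }')
print(json.dumps(nest_by_container(flat_output), indent=2))
```

A key without a colon would raise a `ValueError` here, so a real
implementation would need to decide how to handle malformed keys.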
   
   ## Example output with this PR
   
   Outputs are taken from the integration test's `stdout`, with dummy data for 
demo purposes. Command lines are reconstructed from the test parameters. Actual 
CLI output can differ.
   
   ### `ozone debug ldb --db=/data/metadata/om.db scan --column-family=keyTable 
--limit=1`
   
   ```shell
   { "key1": {
     "volumeName": "vol1",
     "bucketName": "buck1",
     "keyName": "key1",
     "dataSize": 1000,
     "keyLocationVersions": [
       {
         "version": 0,
         "locationVersionMap": {
           "0": []
         },
         "isMultipartKey": false
       }
     ],
     "creationTime": 1679105144793,
     "modificationTime": 1679105144793,
     "replicationConfig": {
       "replicationFactor": "ONE"
     },
     "isFile": false,
     "fileName": "key1",
     "acls": [],
     "parentObjectID": 0,
     "objectID": 0,
     "updateID": 0,
     "metadata": {}
   } }
   ```
   
   ### `ozone debug ldb --db=/data/metadata/om.db scan --column-family=keyTable`
   
   ```shell
   { "key1": {
     "volumeName": "vol1",
     "bucketName": "buck1",
     "keyName": "key1",
     "dataSize": 1000,
     "keyLocationVersions": [
       {
         "version": 0,
         "locationVersionMap": {
           "0": []
         },
         "isMultipartKey": false
       }
     ],
     "creationTime": 1679102602165,
     "modificationTime": 1679102602166,
     "replicationConfig": {
       "replicationFactor": "ONE"
     },
     "isFile": false,
     "fileName": "key1",
     "acls": [],
     "parentObjectID": 0,
     "objectID": 0,
     "updateID": 0,
     "metadata": {}
   }, "key2": {
     "volumeName": "vol1",
     "bucketName": "buck1",
     "keyName": "key2",
     "dataSize": 1000,
     "keyLocationVersions": [
       {
         "version": 0,
         "locationVersionMap": {
           "0": []
         },
         "isMultipartKey": false
       }
     ],
     "creationTime": 1679102602279,
     "modificationTime": 1679102602279,
     "replicationConfig": {
       "replicationFactor": "ONE"
     },
     "isFile": false,
     "fileName": "key2",
     "acls": [],
     "parentObjectID": 0,
     "objectID": 0,
     "updateID": 0,
     "metadata": {}
   }, "key3": {
     "volumeName": "vol1",
     "bucketName": "buck1",
     "keyName": "key3",
     "dataSize": 1000,
     "keyLocationVersions": [
       {
         "version": 0,
         "locationVersionMap": {
           "0": []
         },
         "isMultipartKey": false
       }
     ],
     "creationTime": 1679102602282,
     "modificationTime": 1679102602282,
     "replicationConfig": {
       "replicationFactor": "ONE"
     },
     "isFile": false,
     "fileName": "key3",
     "acls": [],
     "parentObjectID": 0,
     "objectID": 0,
     "updateID": 0,
     "metadata": {}
   }, "key4": {
     "volumeName": "vol1",
     "bucketName": "buck1",
     "keyName": "key4",
     "dataSize": 1000,
     "keyLocationVersions": [
       {
         "version": 0,
         "locationVersionMap": {
           "0": []
         },
         "isMultipartKey": false
       }
     ],
     "creationTime": 1679102602284,
     "modificationTime": 1679102602284,
     "replicationConfig": {
       "replicationFactor": "ONE"
     },
     "isFile": false,
     "fileName": "key4",
     "acls": [],
     "parentObjectID": 0,
     "objectID": 0,
     "updateID": 0,
     "metadata": {}
   }, "key5": {
     "volumeName": "vol1",
     "bucketName": "buck1",
     "keyName": "key5",
     "dataSize": 1000,
     "keyLocationVersions": [
       {
         "version": 0,
         "locationVersionMap": {
           "0": []
         },
         "isMultipartKey": false
       }
     ],
     "creationTime": 1679102602286,
     "modificationTime": 1679102602286,
     "replicationConfig": {
       "replicationFactor": "ONE"
     },
     "isFile": false,
     "fileName": "key5",
     "acls": [],
     "parentObjectID": 0,
     "objectID": 0,
     "updateID": 0,
     "metadata": {}
   } }
   ```
   
   ### `ozone debug ldb --db=/data/metadata/om.db scan --column-family=keyTable 
--limit=2 --with-keys=false`
   
   ```shell
   
   ```
   
   ### `ozone debug ldb --db=/data/hdds/hdds/CID-UUID1/DS-UUID2/container.db 
scan --column-family=block_data --dn-schema=V3 --container-id=2 --limit=2`
   
   ```shell
   { "2: 3": {
     "blockID": {
       "containerBlockID": {
         "containerID": 2,
         "localID": 3
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   }, "2: 4": {
     "blockID": {
       "containerBlockID": {
         "containerID": 2,
         "localID": 4
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   } }
   ```
   
   ### `ozone debug ldb --db=/data/hdds/hdds/CID-UUID1/DS-UUID2/container.db 
scan --column-family=block_data --dn-schema=V2 --limit=4`
   
   ```shell
   { "1": {
     "blockID": {
       "containerBlockID": {
         "containerID": 1,
         "localID": 1
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   }, "2": {
     "blockID": {
       "containerBlockID": {
         "containerID": 1,
         "localID": 2
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   }, "3": {
     "blockID": {
       "containerBlockID": {
         "containerID": 2,
         "localID": 3
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   }, "4": {
     "blockID": {
       "containerBlockID": {
         "containerID": 2,
         "localID": 4
       },
       "blockCommitSequenceId": 0
     },
     "metadata": {},
     "size": 0
   } }
   ```
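
Since the scan output is now a single valid JSON document, a standard parser
can load it directly. The following Python sketch is an illustration only:
`local_ids_by_container` is a hypothetical helper, and it assumes the
`block_data` entry layout shown in the samples above.

```python
import json
from collections import defaultdict


def local_ids_by_container(scan_output: str) -> dict:
    """Map each containerID to the list of localIDs found in the output."""
    blocks = json.loads(scan_output)  # the whole output is one JSON object
    grouped = defaultdict(list)
    for block in blocks.values():
        ids = block["blockID"]["containerBlockID"]
        grouped[ids["containerID"]].append(ids["localID"])
    return dict(grouped)
```

Applied to the V2 sample above, this would return `{1: [1, 2], 2: [3, 4]}`.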


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

