adoroszlai commented on code in PR #10578:
URL: https://github.com/apache/ozone/pull/10578#discussion_r3452869796
##########
hadoop-ozone/cli-repair/src/main/java/org/apache/hadoop/ozone/repair/om/FSORepairTool.java:
##########
@@ -88,6 +89,9 @@ public class FSORepairTool extends RepairTool {
private static final String REACHABLE_TABLE = "reachable";
private static final String PENDING_TO_DELETE_TABLE = "pendingToDelete";
+ @VisibleForTesting
+ static int tempDbBatchSize = 10_000;
Review Comment:
I think it would be better to add a CLI option for the batch size:
- allows users to adjust it without rebuilding (in case that is needed)
- avoids `@VisibleForTesting`
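
A rough sketch of what the option handling could look like, using only the JDK so it runs on its own. The class name `BatchSizeOption` and the flag name `--batch-size` are hypothetical; in the actual tool the value would presumably be declared through the same option mechanism as the tool's other flags and stored in an instance field rather than a mutable static.

```java
import java.util.List;

/** Hypothetical helper for illustration only; not part of Ozone. */
final class BatchSizeOption {
  /** Same default as the constant introduced in the PR. */
  static final int DEFAULT_BATCH_SIZE = 10_000;

  private BatchSizeOption() { }

  /**
   * Returns the value following the (assumed) "--batch-size" flag,
   * or the default when the flag is absent.
   */
  static int parse(List<String> args) {
    int i = args.indexOf("--batch-size");
    if (i < 0) {
      return DEFAULT_BATCH_SIZE;
    }
    if (i + 1 >= args.size()) {
      throw new IllegalArgumentException("--batch-size requires a value");
    }
    int value = Integer.parseInt(args.get(i + 1));
    // A non-positive size would flush on every put or never, so reject it early.
    if (value <= 0) {
      throw new IllegalArgumentException(
          "--batch-size must be positive: " + value);
    }
    return value;
  }
}
```

With something like this, tests can pass a small batch size as an ordinary argument instead of mutating a static field, which is what removes the need for `@VisibleForTesting`.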
##########
hadoop-ozone/cli-repair/src/main/java/org/apache/hadoop/ozone/repair/om/FSORepairTool.java:
##########
@@ -546,23 +555,55 @@ private Collection<String> getChildDirectoriesAndMarkAsPendingToDelete(String di
return childDirs;
}
+ /** Buffers writes to a temp.db table and flushes them in bounded batches. */
+ private final class BatchedTempWriter implements AutoCloseable {
+ private final Table<String, CodecBuffer> table;
+ private BatchOperation batch;
+ private int pending;
+
+ BatchedTempWriter(Table<String, CodecBuffer> table) {
+ this.table = table;
+ this.batch = tempDB.initBatchOperation();
+ }
+
+ void put(String key) throws IOException {
+ table.putWithBatch(batch, key, CodecBuffer.getEmptyBuffer());
+ if (++pending >= tempDbBatchSize) {
+ flush();
+ }
+ }
+
+ private void flush() throws IOException {
+ tempDB.commitBatchOperation(batch);
+ batch.close();
+ batch = tempDB.initBatchOperation();
+ pending = 0;
+ }
+
+ @Override
+ public void close() throws IOException {
+ if (pending > 0) {
+ tempDB.commitBatchOperation(batch);
+ }
+ batch.close();
Review Comment:
Should we also set `pending = 0` here, after the final commit?
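
To illustrate the concern: the `AutoCloseable` Javadoc strongly encourages idempotent `close()` implementations, and without resetting the counter a second `close()` would see `pending > 0` again and try to commit a batch that is already closed. Below is a minimal sketch with stand-in types only; `BatchedWriterSketch`, the `closed` flag, and the in-memory lists are assumptions for illustration and do not use the real `BatchOperation` or `tempDB`.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for the PR's BatchedTempWriter; no Ozone types. */
final class BatchedWriterSketch implements AutoCloseable {
  private final List<String> committed = new ArrayList<>(); // stands in for temp.db
  private final int batchSize;
  private List<String> batch = new ArrayList<>();           // stands in for BatchOperation
  private int pending;
  private int commits;
  private boolean closed;

  BatchedWriterSketch(int batchSize) {
    this.batchSize = batchSize;
  }

  void put(String key) {
    if (closed) {
      throw new IllegalStateException("writer is closed");
    }
    batch.add(key);
    if (++pending >= batchSize) {
      commit();
      batch = new ArrayList<>(); // start a fresh batch after each flush
    }
  }

  /** Commits the buffered keys and resets the counter in one place. */
  private void commit() {
    committed.addAll(batch);
    commits++;
    pending = 0;
  }

  @Override
  public void close() {
    if (closed) {
      return; // second and later calls are no-ops
    }
    closed = true;
    if (pending > 0) {
      commit();
    }
  }

  int commits() {
    return commits;
  }

  int committedCount() {
    return committed.size();
  }
}
```

Resetting `pending` inside the shared commit path keeps `flush()` and `close()` consistent; the extra `closed` flag is one optional way to also guard against `put()` after close.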