Himanshu-g81 commented on code in PR #6435:
URL: https://github.com/apache/hbase/pull/6435#discussion_r1835317085
##########
hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/mapreduce/TestRowCounter.java:
##########
@@ -524,6 +526,106 @@ public void testInvalidTable() throws Exception {
}
}
+ /**
+ * Step 1: Add 6 rows(row1, row2, row3, row4, row5 and row6) to a table. Each row contains 1
+ * column family and 4 columns. Step 2: Delete a column for row1. Step 3: Delete a column family
+ * for row2 and row4. Step 4: Delete all versions of a specific column for row3, row5 and row6.
+ * <p>
+ * Case 1: Run row counter without countDeleteMarkers flag Step a: Validate counter values.
+ * <p>
+ * Case 2: Run row counter with countDeleteMarkers flag Step a: Validate counter values.
+ */
+ @Test
+ public void testRowCounterWithCountDeleteMarkersOption() throws Exception {
+ // Test Setup
+
+ final TableName tableName =
+ TableName.valueOf(TABLE_NAME + "_" + "withCountDeleteMarkersOption");
+ final byte[][] rowKeys = { Bytes.toBytes("row1"), Bytes.toBytes("row2"), Bytes.toBytes("row3"),
+ Bytes.toBytes("row4"), Bytes.toBytes("row5"), Bytes.toBytes("row6") };
+ final byte[] columnFamily = Bytes.toBytes("cf");
+ final byte[][] columns =
+ { Bytes.toBytes("A"), Bytes.toBytes("B"), Bytes.toBytes("C"), Bytes.toBytes("D") };
+ final byte[] value = Bytes.toBytes("a");
+
+ try (Table table = TEST_UTIL.createTable(tableName, columnFamily)) {
+ // Step 1: Insert rows with columns
+ for (byte[] rowKey : rowKeys) {
+ Put put = new Put(rowKey);
+ for (byte[] col : columns) {
+ put.addColumn(columnFamily, col, value);
+ }
+ table.put(put);
+ }
+ TEST_UTIL.getAdmin().flush(tableName);
+
+ // Steps 2, 3, and 4: Delete columns, families, and all versions of columns
Review Comment:
Test coverage could be improved:
1. Add a row for which no delete marker of any type is issued.
2. Add column delete markers for different columns of the same row.
3. Generate a few additional rows (including delete markers) and run the row
counter on a key range that does not include these rows.
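
To make the expected counter values for these three cases concrete, here is a minimal plain-Java sketch. It is only a model and does not call HBase: the class `DeleteMarkerCoverageSketch`, its `CellType` enum, the row keys, and the key range are all hypothetical and chosen for illustration, and the choice of `DELETE_COLUMN` and `DELETE_FAMILY` markers is an assumption rather than what any particular `Delete` call produces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Plain-Java model of the three suggested scenarios; it does not call HBase. */
class DeleteMarkerCoverageSketch {

  /** Hypothetical cell types: PUT for ordinary cells, the rest named after the new counters. */
  enum CellType { PUT, DELETE, DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION }

  /** Returns four ordinary cells, one per column A to D. */
  static List<CellType> fourPuts() {
    List<CellType> cells = new ArrayList<>();
    for (int i = 0; i < 4; i++) {
      cells.add(CellType.PUT);
    }
    return cells;
  }

  /** Builds the modelled table: row key mapped to the cells a raw scan would return. */
  static TreeMap<String, List<CellType>> buildRows() {
    TreeMap<String, List<CellType>> rows = new TreeMap<>();
    // Suggestion 1: a row with no delete marker of any type.
    rows.put("row1", fourPuts());
    // Suggestion 2: one row with column delete markers on two different columns.
    List<CellType> row2 = fourPuts();
    row2.add(CellType.DELETE_COLUMN); // marker for column A (assumed type)
    row2.add(CellType.DELETE_COLUMN); // marker for column B (assumed type)
    rows.put("row2", row2);
    // Suggestion 3: extra rows with markers that lie outside the counted key range.
    List<CellType> row8 = fourPuts();
    row8.add(CellType.DELETE_FAMILY);
    rows.put("row8", row8);
    List<CellType> row9 = fourPuts();
    row9.add(CellType.DELETE);
    rows.put("row9", row9);
    return rows;
  }

  /** Rows whose key falls in [startKey, stopKey), start inclusive and stop exclusive. */
  static SortedMap<String, List<CellType>> inRange(
      TreeMap<String, List<CellType>> rows, String startKey, String stopKey) {
    return rows.subMap(startKey, true, stopKey, false);
  }

  /** Number of rows in the key range. */
  static long rowsInRange(TreeMap<String, List<CellType>> rows, String startKey, String stopKey) {
    return inRange(rows, startKey, stopKey).size();
  }

  /** Number of rows in the key range that carry at least one delete marker. */
  static long rowsWithMarker(TreeMap<String, List<CellType>> rows, String startKey, String stopKey) {
    return inRange(rows, startKey, stopKey).values().stream()
        .filter(cells -> cells.stream().anyMatch(c -> c != CellType.PUT))
        .count();
  }

  /** Number of cells of the given type in the key range. */
  static long markers(TreeMap<String, List<CellType>> rows, CellType type,
      String startKey, String stopKey) {
    return inRange(rows, startKey, stopKey).values().stream()
        .flatMap(List::stream)
        .filter(c -> c == type)
        .count();
  }
}
```

For the range `[row1, row3)` the model yields 2 rows, 1 row with a delete marker, and 2 `DELETE_COLUMN` markers; the markers on `row8` and `row9` are not counted because those rows fall outside the range.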
##########
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java:
##########
@@ -105,9 +158,11 @@ public void map(ImmutableBytesWritable row, Result values, Context context) thro
* @throws IOException When setting up the job fails.
*/
public Job createSubmittableJob(Configuration conf) throws IOException {
+ conf.setBoolean(OPT_COUNT_DELETE_MARKERS, this.countDeleteMarkers);
Review Comment:
Alternatively, this could be set while parsing the input, similar to how it's
done for EXPECTED_COUNT_KEY
[here](https://github.com/apache/hbase/blob/3fbe4fbb68e33f6ed796d268961ae05e19307f65/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java#L177-L181).
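
A rough sketch of that alternative, in plain Java so it runs without Hadoop on the classpath. The class `ParseTimeFlagSketch` is hypothetical, a `Map` stands in for the Hadoop `Configuration`, and a `Set` of option names stands in for the parsed command line; none of this is the actual `RowCounter` code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a tool that records its flag in the configuration at parse time. */
class ParseTimeFlagSketch {
  static final String OPT_COUNT_DELETE_MARKERS = "countDeleteMarkers";

  // Stand-in for the Hadoop Configuration (assumption: simple string key/value pairs).
  private final Map<String, String> conf = new HashMap<>();
  private boolean countDeleteMarkers;

  /** Reads the option and stores it in the configuration in the same step. */
  void processOptions(Set<String> options) {
    this.countDeleteMarkers = options.contains(OPT_COUNT_DELETE_MARKERS);
    conf.put(OPT_COUNT_DELETE_MARKERS, Boolean.toString(this.countDeleteMarkers));
  }

  /** Job setup no longer touches the flag; it only copies the prepared configuration. */
  Map<String, String> createSubmittableJob() {
    return new HashMap<>(conf);
  }
}
```

The point of this arrangement is that job creation stays free of option handling: by the time `createSubmittableJob` runs, the flag is already in the configuration.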
##########
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java:
##########
@@ -65,22 +67,42 @@ public class RowCounter extends AbstractHBaseTool {
private final static String OPT_END_TIME = "endtime";
private final static String OPT_RANGE = "range";
private final static String OPT_EXPECTED_COUNT = "expectedCount";
+ private final static String OPT_COUNT_DELETE_MARKERS = "countDeleteMarkers";
private String tableName;
private List<MultiRowRangeFilter.RowRange> rowRangeList;
private long startTime;
private long endTime;
private long expectedCount;
+ private boolean countDeleteMarkers;
private List<String> columns = new ArrayList<>();
+ private Job job;
+
/**
* Mapper that runs the count.
*/
static class RowCounterMapper extends TableMapper<ImmutableBytesWritable, Result> {
- /** Counter enumeration to count the actual rows. */
+ /** Counter enumeration to count the actual rows, cells and delete markers. */
public static enum Counters {
- ROWS
+ ROWS,
+ CELLS,
+ DELETE,
+ DELETE_COLUMN,
+ DELETE_FAMILY,
+ DELETE_FAMILY_VERSION,
+ ROWS_WITH_DELETE_MARKER
+ }
+
+ private boolean countDeleteMarkers;
Review Comment:
nit: to keep the code tidy, declare this alongside the other class variables (parameters)
[here](https://github.com/apache/hbase/blob/0e69048a4fb6f32a7b9facac94ed44e2cfa5a022/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java#L72-L78).