This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
     new 6334bf19569 [HUDI-6791] Make some comments look better (#11854)
6334bf19569 is described below

commit 6334bf19569fadd5f54bf26b7c8f33a0a2bcca67
Author: Lin Liu <141371752+linliu-c...@users.noreply.github.com>
AuthorDate: Wed Aug 28 18:34:21 2024 -0700

    [HUDI-6791] Make some comments look better (#11854)
---
 .../common/table/cdc/HoodieCDCInferenceCase.java   | 47 +++++++++++-----------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java b/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java
index ed2a1c4c185..6722860ad8e 100644
--- a/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java
+++ b/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java
@@ -24,18 +24,19 @@ package org.apache.hudi.common.table.cdc;
  *
  * AS_IS:
  *   For this type, there must be a real cdc log file from which we get the whole/part change data.
- *   when `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE_AFTER}, it keeps all the fields about the
- *   change data, including `op`, `ts_ms`, `before` and `after`. So read it and return directly,
- *   no more other files need to be loaded.
- *   when `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE}, it keeps the `op`, the key and the
- *   `before` of the changing record. When `op` is equal to 'i' or 'u', need to get the current record from the
- *   current base/log file as `after`.
- *   when `hoodie.table.cdc.supplemental.logging.mode` is 'op_key', it just keeps the `op` and the key of
- *   the changing record. When `op` is equal to 'i', `before` is null and get the current record
- *   from the current base/log file as `after`. When `op` is equal to 'u', get the previous
- *   record from the previous file slice as `before`, and get the current record from the
- *   current base/log file as `after`. When `op` is equal to 'd', get the previous record from
- *   the previous file slice as `before`, and `after` is null.
+ *   When `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE_AFTER},
+ *     it keeps all the fields about the change data, including `op`, `ts_ms`, `before` and `after`.
+ *     So read it and return directly, no more other files need to be loaded.
+ *   When `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE},
+ *     it keeps the `op`, the key and the `before` of the changing record.
+ *     When `op` is equal to 'i' or 'u', need to get the current record from the current base/log file as `after`.
+ *   When `hoodie.table.cdc.supplemental.logging.mode` is '{@link HoodieCDCSupplementalLoggingMode#OP_KEY_ONLY',
+ *     it just keeps the `op` and the key of the changing record.
+ *     When `op` is equal to 'i', `before` is null and get the current record
+ *     from the current base/log file as `after`.
+ *     When `op` is equal to 'u', get the previous record from the previous file slice as `before`,
+ *     and get the current record from the current base/log file as `after`.
+ *     When `op` is equal to 'd', get the previous record from the previous file slice as `before`, and `after` is null.
  *
  * BASE_FILE_INSERT:
  *   For this type, there must be a base file at the current instant. All the records from this
@@ -49,18 +50,16 @@ package org.apache.hudi.common.table.cdc;
  *   the value of `before`. The value of `after` for each record is null.
  *
  * LOG_FILE:
- *   For this type, a normal log file of mor table will be used. First we need to load the previous
- *   file slice(including the base file and other log files in the same file group). Then for each
- *   record from the log file, get the key of this, and execute the following steps:
- *     1) if the record is deleted,
- *       a) if there is a record with the same key in the data loaded, `op` is 'd', 'before' is the
- *          record from the data loaded, `after` is null;
- *       b) if there is not a record with the same key in the data loaded, just skip.
- *     2) the record is not deleted,
- *       a) if there is a record with the same key in the data loaded, `op` is 'u', 'before' is the
- *          record from the data loaded, `after` is the current record;
- *       b) if there is not a record with the same key in the data loaded, `op` is 'i', 'before' is
- *          null, `after` is the current record;
+ *   For this type, a normal log file of MOR table will be used. First we need to load the previous
+ *   file slice (including the base file and other log files in the same file group). Then for each
+ *   record (called `current record` hereafter) from the log file, get its key, and execute the following steps:
+ *     1) if the current record is deleted,
+ *       a) if there is a record with the same key in the data loaded (called `loaded record` hereafter),
+ *          `op` is 'd', 'before' is the loaded record, `after` is null;
+ *       b) if the loaded reocrd does not exist, just skip.
+ *     2) the current record is not deleted,
+ *       a) if there is a loaded record, `op` is 'u', 'before' is the loaded record, `after` is the current record;
+ *       b) if the loaded record does not exist, `op` is 'i', 'before' is null, `after` is the current record;
  *
  * REPLACE_COMMIT:
  *   For this type, it must be a replacecommit, like INSERT_OVERWRITE and DROP_PARTITION. It drops

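For readers of this notification, the LOG_FILE steps described in the revised comment can be sketched roughly as below. This is only an illustration under stated assumptions, not the Hudi implementation: `LogFileCdcSketch`, `CurrentRecord`, `CdcEvent`, and `infer` are hypothetical names, payloads are plain strings, a null payload stands in for a deleted record, and the loaded data is treated as a fixed key-to-record map that is not updated while the log file is scanned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the LOG_FILE inference steps; none of these types are Hudi classes.
final class LogFileCdcSketch {

    /** A record read from the log file. Assumption: a null payload models a delete. */
    static final class CurrentRecord {
        final String key;
        final String payload;

        CurrentRecord(String key, String payload) {
            this.key = key;
            this.payload = payload;
        }

        boolean isDeleted() {
            return payload == null;
        }
    }

    /** One inferred change: `op` is "i", "u" or "d". */
    static final class CdcEvent {
        final String op;
        final String before;
        final String after;

        CdcEvent(String op, String before, String after) {
            this.op = op;
            this.before = before;
            this.after = after;
        }

        @Override
        public String toString() {
            return op + ": " + before + " -> " + after;
        }
    }

    /**
     * loadedByKey: records from the previous file slice, keyed by record key.
     * logRecords: the current records from the log file, in order.
     */
    static List<CdcEvent> infer(Map<String, String> loadedByKey, List<CurrentRecord> logRecords) {
        List<CdcEvent> events = new ArrayList<>();
        for (CurrentRecord current : logRecords) {
            String loaded = loadedByKey.get(current.key);
            if (current.isDeleted()) {
                if (loaded != null) {
                    events.add(new CdcEvent("d", loaded, null)); // step 1a
                }
                // step 1b: no loaded record, so the delete is skipped
            } else if (loaded != null) {
                events.add(new CdcEvent("u", loaded, current.payload)); // step 2a
            } else {
                events.add(new CdcEvent("i", null, current.payload)); // step 2b
            }
        }
        return events;
    }

    public static void main(String[] args) {
        Map<String, String> loaded = new HashMap<>();
        loaded.put("k1", "v1");
        loaded.put("k2", "v2");
        List<CurrentRecord> log = Arrays.asList(
            new CurrentRecord("k1", null),   // delete of an existing key (1a)
            new CurrentRecord("k9", null),   // delete of an unknown key (1b)
            new CurrentRecord("k2", "v2b"),  // update (2a)
            new CurrentRecord("k3", "v3"));  // insert (2b)
        for (CdcEvent event : infer(loaded, log)) {
            System.out.println(event);
        }
    }
}
```

With the sample input in `main`, the sketch emits three events (a delete, an update, and an insert); the delete for the unknown key `k9` produces nothing, matching step 1b.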