This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/master by this push:
     new 6334bf19569 [HUDI-6791] Make some comments look better (#11854)
6334bf19569 is described below

commit 6334bf19569fadd5f54bf26b7c8f33a0a2bcca67
Author: Lin Liu <141371752+linliu-c...@users.noreply.github.com>
AuthorDate: Wed Aug 28 18:34:21 2024 -0700

    [HUDI-6791] Make some comments look better (#11854)
---
 .../common/table/cdc/HoodieCDCInferenceCase.java | 47 +++++++++++-----------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java b/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java
index ed2a1c4c185..6722860ad8e 100644
--- a/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java
+++ b/hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCInferenceCase.java
@@ -24,18 +24,19 @@ package org.apache.hudi.common.table.cdc;
  *
  * AS_IS:
  * For this type, there must be a real cdc log file from which we get the whole/part change data.
- * when `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE_AFTER}, it keeps all the fields about the
- * change data, including `op`, `ts_ms`, `before` and `after`. So read it and return directly,
- * no more other files need to be loaded.
- * when `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE}, it keeps the `op`, the key and the
- * `before` of the changing record. When `op` is equal to 'i' or 'u', need to get the current record from the
- * current base/log file as `after`.
- * when `hoodie.table.cdc.supplemental.logging.mode` is 'op_key', it just keeps the `op` and the key of
- * the changing record. When `op` is equal to 'i', `before` is null and get the current record
- * from the current base/log file as `after`. When `op` is equal to 'u', get the previous
- * record from the previous file slice as `before`, and get the current record from the
- * current base/log file as `after`. When `op` is equal to 'd', get the previous record from
- * the previous file slice as `before`, and `after` is null.
+ * When `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE_AFTER},
+ * it keeps all the fields about the change data, including `op`, `ts_ms`, `before` and `after`.
+ * So read it and return directly, no more other files need to be loaded.
+ * When `hoodie.table.cdc.supplemental.logging.mode` is {@link HoodieCDCSupplementalLoggingMode#DATA_BEFORE},
+ * it keeps the `op`, the key and the `before` of the changing record.
+ * When `op` is equal to 'i' or 'u', need to get the current record from the current base/log file as `after`.
+ * When `hoodie.table.cdc.supplemental.logging.mode` is '{@link HoodieCDCSupplementalLoggingMode#OP_KEY_ONLY',
+ * it just keeps the `op` and the key of the changing record.
+ * When `op` is equal to 'i', `before` is null and get the current record
+ * from the current base/log file as `after`.
+ * When `op` is equal to 'u', get the previous record from the previous file slice as `before`,
+ * and get the current record from the current base/log file as `after`.
+ * When `op` is equal to 'd', get the previous record from the previous file slice as `before`, and `after` is null.
  *
  * BASE_FILE_INSERT:
  * For this type, there must be a base file at the current instant. All the records from this
@@ -49,18 +50,16 @@ package org.apache.hudi.common.table.cdc;
  * the value of `before`. The value of `after` for each record is null.
  *
  * LOG_FILE:
- * For this type, a normal log file of mor table will be used. First we need to load the previous
- * file slice(including the base file and other log files in the same file group). Then for each
- * record from the log file, get the key of this, and execute the following steps:
- * 1) if the record is deleted,
- * a) if there is a record with the same key in the data loaded, `op` is 'd', 'before' is the
- * record from the data loaded, `after` is null;
- * b) if there is not a record with the same key in the data loaded, just skip.
- * 2) the record is not deleted,
- * a) if there is a record with the same key in the data loaded, `op` is 'u', 'before' is the
- * record from the data loaded, `after` is the current record;
- * b) if there is not a record with the same key in the data loaded, `op` is 'i', 'before' is
- * null, `after` is the current record;
+ * For this type, a normal log file of MOR table will be used. First we need to load the previous
+ * file slice (including the base file and other log files in the same file group). Then for each
+ * record (called `current record` hereafter) from the log file, get its key, and execute the following steps:
+ * 1) if the current record is deleted,
+ * a) if there is a record with the same key in the data loaded (called `loaded record` hereafter),
+ * `op` is 'd', 'before' is the loaded record, `after` is null;
+ * b) if the loaded reocrd does not exist, just skip.
+ * 2) the current record is not deleted,
+ * a) if there is a loaded record, `op` is 'u', 'before' is the loaded record, `after` is the current record;
+ * b) if the loaded record does not exist, `op` is 'i', 'before' is null, `after` is the current record;
  *
  * REPLACE_COMMIT:
  * For this type, it must be a replacecommit, like INSERT_OVERWRITE and DROP_PARTITION. It drops