This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new eb6a998fd85 updated delete to mention duplicates- and did some writing cleanup (#10659)
eb6a998fd85 is described below

commit eb6a998fd85deaf7fef551f74ea70b0f08cffe22
Author: nadine farah <nfara...@gmail.com>
AuthorDate: Fri Feb 23 16:16:27 2024 -0800

    updated delete to mention duplicates- and did some writing cleanup (#10659)
---
 website/docs/write_operations.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/docs/write_operations.md b/website/docs/write_operations.md
index 90b87499fe0..3146db05802 100644
--- a/website/docs/write_operations.md
+++ b/website/docs/write_operations.md
@@ -29,7 +29,7 @@ of initial load. However, this just does a best-effort job at sizing files vs gu
 Hudi supports implementing two types of deletes on data stored in Hudi tables, by enabling the user to specify a different record payload implementation.
 - **Soft Deletes** : Retain the record key and just null out the values for all the other fields.
   This can be achieved by ensuring the appropriate fields are nullable in the table schema and simply upserting the table after setting these fields to null.
-- **Hard Deletes** : A stronger form of deletion is to physically remove any trace of the record from the table. This can be achieved in 3 different ways.
+- **Hard Deletes** : This method entails completely eradicating all evidence of a record from the table, including any duplicates. There are three distinct approaches to accomplish this:
   - Using DataSource, set `OPERATION_OPT_KEY` to `DELETE_OPERATION_OPT_VAL`. This will remove all the records in the DataSet being submitted.
   - Using DataSource, set `PAYLOAD_CLASS_OPT_KEY` to `"org.apache.hudi.EmptyHoodieRecordPayload"`. This will remove all the records in the DataSet being submitted.
   - Using DataSource or Hudi Streamer, add a column named `_hoodie_is_deleted` to DataSet. The value of this column must be set to `true` for all the records to be deleted and either `false` or left null for any records which are to be upserted.
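The three hard-delete approaches listed in the diff can be sketched as plain option dictionaries plus a row-flagging helper. This is a minimal illustration, not part of the commit: the string keys `hoodie.datasource.write.operation` and `hoodie.datasource.write.payload.class` are assumed to be the values behind `OPERATION_OPT_KEY` and `PAYLOAD_CLASS_OPT_KEY`, `delete` is assumed to be the value of `DELETE_OPERATION_OPT_VAL`, and the helper names and the `id` key field are hypothetical. The payload class name is copied as written in the documentation text.

```python
from typing import Dict, Iterable, List, Set

# Assumed string values behind the constants named in the docs.
OPERATION_OPT_KEY = "hoodie.datasource.write.operation"
DELETE_OPERATION_OPT_VAL = "delete"
PAYLOAD_CLASS_OPT_KEY = "hoodie.datasource.write.payload.class"
# Class name copied verbatim from the documentation text.
EMPTY_PAYLOAD_CLASS = "org.apache.hudi.EmptyHoodieRecordPayload"


def delete_operation_options() -> Dict[str, str]:
    """Approach 1: ask the writer to run a delete operation."""
    return {OPERATION_OPT_KEY: DELETE_OPERATION_OPT_VAL}


def empty_payload_options() -> Dict[str, str]:
    """Approach 2: write every submitted record with an empty payload."""
    return {PAYLOAD_CLASS_OPT_KEY: EMPTY_PAYLOAD_CLASS}


def flag_deletes(
    records: Iterable[dict],
    keys_to_delete: Set[str],
    key_field: str = "id",  # hypothetical record key column
) -> List[dict]:
    """Approach 3: add `_hoodie_is_deleted`, true only for rows to remove."""
    return [
        {**row, "_hoodie_is_deleted": row[key_field] in keys_to_delete}
        for row in records
    ]


if __name__ == "__main__":
    rows = [{"id": "a", "fare": 10.0}, {"id": "b", "fare": 12.5}]
    print(delete_operation_options())
    print(empty_payload_options())
    print(flag_deletes(rows, {"b"}))
```

With PySpark, dictionaries like these would typically be merged with the table's other required write options and passed through `df.write.format("hudi").options(**opts)`; in the third approach the flagged rows are written with a regular upsert, so rows marked `false` are upserted and rows marked `true` are deleted.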
