jonvex commented on code in PR #9338:
URL: https://github.com/apache/hudi/pull/9338#discussion_r1282347510


##########
website/docs/migration_guide.md:
##########
@@ -56,11 +64,13 @@ spark-submit --master local \
 --hoodie-conf hoodie.bootstrap.keygen.class=org.apache.hudi.keygen.SimpleKeyGenerator \
 --hoodie-conf hoodie.bootstrap.full.input.provider=org.apache.hudi.bootstrap.SparkParquetBootstrapDataProvider \

Review Comment:
   I don't think we need `--hoodie-conf hoodie.bootstrap.full.input.provider` in the example.



##########
website/docs/migration_guide.md:
##########
@@ -69,12 +79,28 @@ for partition in [list of partitions in source table] {
 }
 ```  
 
-**Option 3**
+**Option 3 using Spark SQL CALL Procedure**
+
+Refer to [Bootstrap procedure](https://hudi.apache.org/docs/next/procedures#bootstrap) for more details. 
+
+**Option 4 using Hudi CLI**
+
 Write your own custom logic of how to load an existing table into a Hudi managed one. Please read about the RDD API
 [here](/docs/quick-start-guide). Using the bootstrap run CLI. Once hudi has been built via `mvn clean install -DskipTests`, the shell can be
 fired by via `cd hudi-cli && ./hudi-cli.sh`.
 
 ```java
 hudi->bootstrap run --srcPath /tmp/source_table --targetPath /tmp/hoodie/bootstrap_table --tableName bootstrap_table --tableType COPY_ON_WRITE --rowKeyField ${KEY_FIELD} --partitionPathField ${PARTITION_FIELD} --sparkMaster local --hoodieConfigs hoodie.datasource.write.hive_style_partitioning=true --selectorClass org.apache.hudi.client.bootstrap.selector.FullRecordBootstrapModeSelector
 ```
-Unlike deltaStream, FULL_RECORD or METADATA_ONLY is set with --selectorClass, see detalis with help "bootstrap run".
+Unlike Hudi Streamer, FULL_RECORD or METADATA_ONLY is set with --selectorClass, see details with help "bootstrap run".
+
+
+## Configs
+
+Here are the basic configs that control bootstrapping.
+
+| Config Name                                          | Default            | Description                                                                                                                             |
+| ---------------------------------------------------- | ------------------ | --------------------------------------------------------------------------------------------------------------------------------------- |
+| hoodie.bootstrap.base.path | N/A **(Required)** | Base path of the dataset that needs to be bootstrapped as a Hudi table<br /><br />`Config Param: BASE_PATH`<br />`Since Version: 0.6.0` |
+
+By default, with only `hoodie.bootstrap.base.path` being provided METADATA_ONLY mode is selected. For other options, please refer [bootstrap configs](https://hudi.apache.org/docs/next/configurations#Bootstrap-Configs) for more details.

Review Comment:
   I think adding `hoodie.bootstrap.mode.selector.regex.mode`, `hoodie.bootstrap.mode.selector`, and `hoodie.bootstrap.mode.selector.regex` to the simple configs would be helpful. At a minimum, `hoodie.bootstrap.mode.selector` should be added.
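
   A rough sketch of how the two regex settings might interact when a regex-based selector is configured through `hoodie.bootstrap.mode.selector`. This is plain Python for illustration, not Hudi code. It assumes that a partition whose path fully matches `hoodie.bootstrap.mode.selector.regex` gets the mode named in `hoodie.bootstrap.mode.selector.regex.mode`, and that every other partition gets the opposite mode. The function `select_modes` and the partition paths are hypothetical.

```python
import re

# Stand-ins for the two bootstrap modes named in the docs.
METADATA_ONLY = "METADATA_ONLY"
FULL_RECORD = "FULL_RECORD"


def select_modes(partitions, regex=".*", regex_mode=METADATA_ONLY):
    """Map each partition path to a bootstrap mode.

    Assumption (not verified against Hudi's source): partitions whose
    whole path matches `regex` get `regex_mode`; all others get the
    opposite mode.
    """
    other_mode = FULL_RECORD if regex_mode == METADATA_ONLY else METADATA_ONLY
    pattern = re.compile(regex)
    return {
        path: regex_mode if pattern.fullmatch(path) else other_mode
        for path in partitions
    }


# Hypothetical partition paths of the source table.
partitions = ["2023/08/01", "2023/08/02", "2022/12/31"]

# Fully rewrite the 2023 partitions, keep only metadata for older ones.
modes = select_modes(partitions, regex=r"2023/.*", regex_mode=FULL_RECORD)

# With the assumed defaults, every partition is METADATA_ONLY.
default_modes = select_modes(partitions)
```

   Under these assumptions, leaving both regex settings at their defaults gives the same result as METADATA_ONLY for the whole table, which is why listing at least `hoodie.bootstrap.mode.selector` next to `hoodie.bootstrap.base.path` would make the default behaviour easier to understand.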


