[ https://issues.apache.org/jira/browse/HUDI-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lvyanquan updated HUDI-6041:
----------------------------
    Description:
We need to write extra properties to an HDFS file for the [Bootstrap Procedure|https://hudi.apache.org/docs/next/procedures#run_bootstrap] and pass its location via `props_file_path`, which makes the procedure troublesome to call. For example:

{code:sql}
call run_bootstrap(
  table => 'test_hudi_table',
  table_type => 'COPY_ON_WRITE',
  bootstrap_path => 'hdfs://ns1/hive/warehouse/hudi.db/test_hudi_table',
  base_path => 'hdfs://ns1//tmp/hoodie/test_hudi_table',
  rowKey_field => 'id',
  partition_path_field => 'dt',
  props_file_path => 'hdfs://ns1//tmp/tableProp.txt');
{code}

Alternatively, we can set those properties through session config, but that means executing several `set` statements before calling the procedure.

We can add a new procedure input parameter named `properties` that accepts key-value pairs, like:

{code:sql}
call run_bootstrap(
  table => 'test_hudi_table',
  table_type => 'COPY_ON_WRITE',
  bootstrap_path => 'hdfs://ns1/hive/warehouse/hudi.db/test_hudi_table',
  base_path => 'hdfs://ns1//tmp/hoodie/test_hudi_table',
  rowKey_field => 'id',
  partition_path_field => 'dt',
  properties => 'hoodie.datasource.write.hive_style_partitioning=true');
{code}

That way we don't need to put another file on HDFS.

  was:
We need to write properties to an HDFS file for the [Bootstrap Procedure|https://hudi.apache.org/docs/next/procedures#run_bootstrap] and set `props_file_path`, which makes it troublesome to call this procedure.
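To illustrate how a procedure might turn the proposed `properties` string into key-value pairs, here is a minimal sketch. It is not taken from the Hudi code base: the class name `ProcedureProperties`, the method `parse`, and the comma separator between pairs are all assumptions made for illustration, since the issue only shows a single `key=value` pair.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProcedureProperties {

    // Hypothetical helper: turns "k1=v1,k2=v2" into an insertion-ordered map.
    // The comma separator is an assumption; the issue shows only one pair.
    public static Map<String, String> parse(String properties) {
        Map<String, String> result = new LinkedHashMap<>();
        if (properties == null || properties.trim().isEmpty()) {
            return result;
        }
        for (String pair : properties.split(",")) {
            String trimmed = pair.trim();
            // Split on the first '=' only, so a value may itself contain '='.
            int idx = trimmed.indexOf('=');
            if (idx <= 0) {
                throw new IllegalArgumentException(
                    "Expected key=value but got: '" + trimmed + "'");
            }
            String key = trimmed.substring(0, idx).trim();
            String value = trimmed.substring(idx + 1).trim();
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same value as in the proposed run_bootstrap call.
        Map<String, String> props =
            parse("hoodie.datasource.write.hive_style_partitioning=true");
        System.out.println(props);
    }
}
```

Splitting on the first `=` keeps values that contain `=` intact. A value that contains a comma would need a different separator or an escaping rule, which this sketch does not attempt.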
> add `properties` to Hudi Spark Procedures
> -----------------------------------------
>
>                 Key: HUDI-6041
>                 URL: https://issues.apache.org/jira/browse/HUDI-6041
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: bootstrap, spark-sql
>            Reporter: lvyanquan
>            Priority: Major