[ https://issues.apache.org/jira/browse/HUDI-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raymond Xu updated HUDI-2390:
-----------------------------
    Issue Type: Improvement  (was: Bug)

> Create table by hudisql,write data into table by datasource,hudi delete cmd can not delete data
> -----------------------------------------------------------------------------------------------
>
>                 Key: HUDI-2390
>                 URL: https://issues.apache.org/jira/browse/HUDI-2390
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>    Affects Versions: 0.9.0
>            Reporter: renhao
>            Priority: Minor
>              Labels: features
>
> Test Case:
> {code:java}
> import org.apache.hudi.QuickstartUtils._
> import scala.collection.JavaConversions._
> import org.apache.spark.sql.SaveMode._
> import org.apache.hudi.DataSourceReadOptions._
> import org.apache.hudi.DataSourceWriteOptions._
> import org.apache.hudi.config.HoodieWriteConfig._
> {code}
> 1. Prepare the data
>
> {code:java}
> spark.sql("create table test1(a int,b string,c string) using hudi partitioned by(b) options(primaryKey='a')")
> spark.sql("insert into table test1 select 1,2,3")
> {code}
>
> 2. Create the Hudi table test2
> {code:java}
> spark.sql("create table test2(a int,b string,c string) using hudi partitioned by(b) options(primaryKey='a')")
> {code}
> 3. Write data into test2 through the datasource API
>
> {code:java}
> val base_data=spark.sql("select * from testdb.test1")
> base_data.write.format("hudi").
>   option(TABLE_TYPE_OPT_KEY, COW_TABLE_TYPE_OPT_VAL).
>   option(RECORDKEY_FIELD_OPT_KEY, "a").
>   option(PARTITIONPATH_FIELD_OPT_KEY, "b").
>   option(KEYGENERATOR_CLASS_OPT_KEY, "org.apache.hudi.keygen.SimpleKeyGenerator").
>   option(OPERATION_OPT_KEY, "bulk_insert").
>   option(HIVE_SYNC_ENABLED_OPT_KEY, "true").
>   option(HIVE_PARTITION_FIELDS_OPT_KEY, "b").
>   option(HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY,"org.apache.hudi.hive.MultiPartKeysValueExtractor").
>   option(HIVE_DATABASE_OPT_KEY, "testdb").
>   option(HIVE_TABLE_OPT_KEY, "test2").
>   option(HIVE_USE_JDBC_OPT_KEY, "true").
>   option("hoodie.bulkinsert.shuffle.parallelism", 4).
>   option("hoodie.datasource.write.hive_style_partitioning", "true").
>   option(TABLE_NAME, "test2").mode(Append).save(s"/user/hive/warehouse/testdb.db/test2")
> {code}
>
> Querying the table at this point returns:
> {code:java}
> +---+---+---+
> |  a|  b|  c|
> +---+---+---+
> |  1|  3|  2|
> +---+---+---+
> {code}
> 4. Delete one record
> {code:java}
> spark.sql("delete from testdb.test2 where a=1")
> {code}
> 5. Run the query again; the record with a=1 has not been deleted
> {code:java}
> spark.sql("select a,b,c from testdb.test2").show
> {code}
> {code:java}
> +---+---+---+
> |  a|  b|  c|
> +---+---+---+
> |  1|  3|  2|
> +---+---+---+
> {code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)