[ https://issues.apache.org/jira/browse/HUDI-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Udit Mehrotra updated HUDI-2250:
--------------------------------
    Fix Version/s:     (was: 0.9.0)
                       0.10.0

> [SQL] Bulk insert support for tables w/ primary key
> ---------------------------------------------------
>
>                 Key: HUDI-2250
>                 URL: https://issues.apache.org/jira/browse/HUDI-2250
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: sivabalan narayanan
>            Priority: Blocker
>              Labels: release-blocker
>             Fix For: 0.10.0
>
>
> We want to support bulk insert for any table. Right now, there is a constraint that only tables without a primary key can be bulk inserted.
>
> set hoodie.sql.bulk.insert.enable = true;
> hoodie.sql.bulk.insert.enable	true
> Time taken: 2.019 seconds, Fetched 1 row(s)
> spark-sql> set hoodie.datasource.write.row.writer.enable = true;
> hoodie.datasource.write.row.writer.enable	true
> Time taken: 0.026 seconds, Fetched 1 row(s)
> spark-sql> create table hudi_17Gb_ext1 using hudi location
>   's3a://siva-test-bucket-june-16/hudi_testing/gh_arch_dump/hudi_5/' options (
>     type = 'cow',
>     primaryKey = 'randomId',
>     preCombineField = 'date_col'
>   )
>   partitioned by (type) as select * from gh_17Gb_date_col;
> 21/07/29 04:26:15 ERROR SparkSQLDriver: Failed in [create table hudi_17Gb_ext1 using hudi location
>   's3a://siva-test-bucket-june-16/hudi_testing/gh_arch_dump/hudi_5/' options (
>     type = 'cow',
>     primaryKey = 'randomId',
>     preCombineField = 'date_col'
>   )
>   partitioned by (type) as select * from gh_17Gb_date_col]
> java.lang.IllegalArgumentException: Table with primaryKey can not use bulk insert.
>         at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.buildHoodieInsertConfig(InsertIntoHoodieTableCommand.scala:219)
>         at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:78)
>         at org.apache.spark.sql.hudi.command.CreateHoodieTableAsSelectCommand.run(CreateHoodieTableAsSelectCommand.scala:86)
>         at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:108)
>         at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:106)
>         at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:120)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
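For reference, the rejection above comes from the Spark SQL layer (InsertIntoHoodieTableCommand), not from the write path itself; to my understanding the DataFrame-based datasource write already accepts bulk_insert together with a record key. A minimal sketch of that path follows, reusing the table name, record key, precombine field, partition column, and S3 path from the failing CTAS; the spark-shell session, the SaveMode, and the COPY_ON_WRITE setting are assumptions, and the option keys are the standard Hudi datasource write configs.

// Sketch only: assumes a spark-shell with the Hudi Spark bundle on the classpath,
// so the `spark` session is already available.
import org.apache.spark.sql.SaveMode

// Same source table as the CTAS in the session above.
val df = spark.table("gh_17Gb_date_col")

df.write.format("hudi").
  option("hoodie.table.name", "hudi_17Gb_ext1").
  option("hoodie.datasource.write.table.type", "COPY_ON_WRITE").
  // bulk_insert is the operation the SQL path currently refuses for keyed tables.
  option("hoodie.datasource.write.operation", "bulk_insert").
  // Primary key and precombine field from the CTAS options clause.
  option("hoodie.datasource.write.recordkey.field", "randomId").
  option("hoodie.datasource.write.precombine.field", "date_col").
  option("hoodie.datasource.write.partitionpath.field", "type").
  // Same row-writer flag that was set in the spark-sql session.
  option("hoodie.datasource.write.row.writer.enable", "true").
  mode(SaveMode.Overwrite).
  save("s3a://siva-test-bucket-june-16/hudi_testing/gh_arch_dump/hudi_5/")

The ask in this ticket is essentially for the SQL bulk-insert path to match this behavior instead of throwing when primaryKey is set.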