[ https://issues.apache.org/jira/browse/HUDI-7480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinaykumar Bhat resolved HUDI-7480.
-----------------------------------

> initializeFunctionalIndexPartition is called multiple times
> -----------------------------------------------------------
>
>                 Key: HUDI-7480
>                 URL: https://issues.apache.org/jira/browse/HUDI-7480
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Vinaykumar Bhat
>            Assignee: Vinaykumar Bhat
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>
> This is due to an issue in initializeFromFilesystem(), which tries to check
> whether an MDT partition needs to be initialized based on the absence of its
> partition-type. But for a functional index, partition-type actually stores
> only the prefix (func_index_), hence the check always fails and we try to
> reinitialize the same functional index partition again.
>
> Simple test:
> {quote}spark.sql(
>   s"""
>      |create table $tableName (
>      |  id int,
>      |  name string,
>      |  price double,
>      |  ts long
>      |) using hudi
>      | options (
>      |  primaryKey ='id',
>      |  type = '$tableType',
>      |  preCombineField = 'ts',
>      |  hoodie.metadata.record.index.enable = 'true',
>      |  hoodie.datasource.write.recordkey.field = 'id'
>      | )
>      | partitioned by(ts)
>      | location '$basePath'
>      """.stripMargin)
> spark.sql(s"insert into $tableName values(1, 'a1', 10, 1000)")
> spark.sql(s"insert into $tableName values(2, 'a2', 10, 1001)")
> spark.sql(s"insert into $tableName values(3, 'a3', 10, 1002)")
>
> var createIndexSql = s"create index idx_datestr on $tableName using column_stats(ts) options(func='from_unixtime', format='yyyy-MM-dd')"
> spark.sql(createIndexSql)
>
> // This insert throws a null-pointer exception
> spark.sql(s"insert into $tableName values(4, 'a4', 10, 1004)"){quote}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
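
The failing check described in the issue can be illustrated with a minimal, self-contained sketch. This is not Hudi code: the object and method names below are hypothetical, and the partition name `func_index_idx_datestr` assumes that a functional index partition is named by the `func_index_` prefix followed by the index name. The prefix-aware variant is only one possible way to avoid the repeated initialization; the actual fix in Hudi may differ.

```scala
// Hypothetical sketch of the check described in HUDI-7480; none of these
// names are Hudi APIs.
object PartitionInitCheck {

  // Exact-match check: the partition is considered uninitialized unless the
  // partition-type string itself appears among the existing partitions.
  def needsInitExact(partitionType: String, existing: Set[String]): Boolean =
    !existing.contains(partitionType)

  // Prefix-aware check: the partition is considered initialized when any
  // existing partition name starts with the partition-type string.
  def needsInitByPrefix(partitionType: String, existing: Set[String]): Boolean =
    !existing.exists(_.startsWith(partitionType))

  def main(args: Array[String]): Unit = {
    // Assumed state of the metadata table after the index has been created.
    val existing = Set("files", "record_index", "func_index_idx_datestr")

    // The bare prefix never matches a full partition name exactly, so the
    // exact-match check keeps asking for initialization.
    println(needsInitExact("func_index_", existing))

    // Matching on the prefix recognizes the existing partition.
    println(needsInitByPrefix("func_index_", existing))
  }
}
```

With the assumed partition set, the exact-match check reports that `func_index_` still needs initialization on every call, which corresponds to the repeated call to initializeFunctionalIndexPartition in the issue title, while the prefix-aware check does not.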