[ https://issues.apache.org/jira/browse/HUDI-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lietong Liu reassigned HUDI-1667:
---------------------------------

    Assignee: Lietong Liu

> Fix bug when HoodieMergeOnReadRDD read record from base file, Hoodie may set non-null value in field which is null if vectorization is enabled.
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-1667
>                 URL: https://issues.apache.org/jira/browse/HUDI-1667
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: Lietong Liu
>            Assignee: Lietong Liu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.6.0
>
>
> When HoodieMergeOnReadRDD reads a record from the base file, it creates a new
> InternalRow based on requiredStructSchema.
> {code:java}
> // code placeholder
> private def createRowWithRequiredSchema(row: InternalRow): InternalRow = {
>   val rowToReturn = new SpecificInternalRow(tableState.requiredStructSchema)
>   val posIterator = requiredFieldPosition.iterator
>   var curIndex = 0
>   tableState.requiredStructSchema.foreach(
>     f => {
>       val curPos = posIterator.next()
>       val curField = row.get(curPos, f.dataType)
>       rowToReturn.update(curIndex, curField)
>       curIndex = curIndex + 1
>     }
>   )
>   rowToReturn
> }
> {code}
> Hoodie does not check isNull before reading the value of each field here.
> If vectorization is enabled, row is a *ColumnarBatchRow*.
> *ColumnarBatchRow* may return a non-null value even if the value of the field
> is null, so Hoodie may write a non-null value into a field that should be null.
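
The missing null check described above can be sketched as follows. This is a minimal illustration, not necessarily the change that was merged for this issue. It assumes a standalone helper in a hypothetical object NullSafeProjection that takes the required schema and the source positions as parameters (requiredSchema and requiredPositions are illustrative names standing in for tableState.requiredStructSchema and requiredFieldPosition). It relies only on InternalRow.isNullAt and SpecificInternalRow.setNullAt from Spark's catalyst module.

{code:scala}
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.SpecificInternalRow
import org.apache.spark.sql.types.StructType

// Hypothetical container object, used only to keep the sketch self-contained.
object NullSafeProjection {

  // requiredSchema and requiredPositions are assumed parameters; in the
  // snippet from the issue they come from tableState and requiredFieldPosition.
  def createRowWithRequiredSchema(row: InternalRow,
                                  requiredSchema: StructType,
                                  requiredPositions: Seq[Int]): InternalRow = {
    val rowToReturn = new SpecificInternalRow(requiredSchema)
    requiredSchema.fields.zip(requiredPositions).zipWithIndex.foreach {
      case ((field, srcPos), dstIndex) =>
        if (row.isNullAt(srcPos)) {
          // Ask the source row whether the field is null before reading it,
          // so a value returned for a null slot is never copied.
          rowToReturn.setNullAt(dstIndex)
        } else {
          rowToReturn.update(dstIndex, row.get(srcPos, field.dataType))
        }
    }
    rowToReturn
  }
}
{code}

Checking isNullAt first means the copy no longer depends on what the source row's get returns for a null field, which is the behaviour the report attributes to ColumnarBatchRow.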