LuciferYang commented on PR #46447: URL: https://github.com/apache/spark/pull/46447#issuecomment-2175086631
> val maxDef = if (inputValues.contains(null)) 1 else 0
> val ty = parquetSchema.asGroupType().getType("a").asPrimitiveType()
> val cd = new ColumnDescriptor(Seq("a").toArray, ty, 0, maxDef)
> val repetitionLevels = Array.fill[Int](inputValues.length)(0)
> val definitionLevels = inputValues.map(v => if (v == null) 0 else 1)

@wgtmac Thank you for your explanation. It seems you are correct. Should Line 505 be changed from

```scala
val definitionLevels = inputValues.map(v => if (v == null) 0 else 1)
```

to

```scala
val definitionLevels = inputValues.map(v => if (v == null) 0 else maxDef)
```

? I tested it manually, and with this change `ParquetVectorizedSuite` passes.
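
To illustrate why the proposed change matters, here is a minimal sketch that assumes a flat (non-nested) column and uses no Parquet classes. The object `DefinitionLevels` and its helpers `maxDefinitionLevel` and `definitionLevels` are hypothetical names introduced only for this example; they are not part of the PR or of Spark. The point is that when the input has no nulls, `maxDef` is 0, so emitting 1 for non-null values would exceed the column's maximum definition level.

```scala
object DefinitionLevels {
  // Hypothetical helper: mirrors the quoted test code, where the column is
  // treated as optional (max definition level 1) only if a null is present.
  def maxDefinitionLevel(values: Seq[Any]): Int =
    if (values.contains(null)) 1 else 0

  // Hypothetical helper: the proposed fix. Non-null values get maxDef
  // rather than a hard-coded 1, so no level ever exceeds maxDef.
  def definitionLevels(values: Seq[Any]): Seq[Int] = {
    val maxDef = maxDefinitionLevel(values)
    values.map(v => if (v == null) 0 else maxDef)
  }

  def main(args: Array[String]): Unit = {
    // With a null present, maxDef is 1: nulls map to 0, values to 1.
    println(definitionLevels(Seq(1, null, 3)))
    // With no nulls, maxDef is 0: every level must be 0.
    println(definitionLevels(Seq(1, 2, 3)))
  }
}
```

With the original hard-coded `1`, the second call would yield levels of 1 against a maximum of 0, which is the inconsistency discussed in this thread.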