LuciferYang commented on PR #46447:
URL: https://github.com/apache/spark/pull/46447#issuecomment-2175086631

   > val maxDef = if (inputValues.contains(null)) 1 else 0
   > val ty = parquetSchema.asGroupType().getType("a").asPrimitiveType()
   > val cd = new ColumnDescriptor(Seq("a").toArray, ty, 0, maxDef)
   > val repetitionLevels = Array.fill[Int](inputValues.length)(0)
   > val definitionLevels = inputValues.map(v => if (v == null) 0 else 1)
   
   @wgtmac Thank you for the explanation. It seems you are correct. Should
   line 505 be changed from
   
   ```scala
   val definitionLevels = inputValues.map(v => if (v == null) 0 else 1)
   ```
   to 
   
   ```scala
   val definitionLevels = inputValues.map(v => if (v == null) 0 else maxDef)
   ```
   
   ? I tested this change manually, and `ParquetVectorizedSuite` passes with it.
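
   To illustrate why the two expressions differ, here is a minimal standalone sketch. It assumes only the rule quoted above (`maxDef` is 1 when the input contains a null, otherwise 0) and does not touch any Parquet classes. `DefinitionLevelSketch` and its methods are hypothetical names used for illustration; they are not part of Spark or of the suite.

   ```scala
   // Hypothetical illustration only: not code from Spark or ParquetVectorizedSuite.
   object DefinitionLevelSketch {
     // Mirrors the quoted rule: max definition level is 1 only if a null is present.
     def maxDefFor(inputValues: Seq[String]): Int =
       if (inputValues.contains(null)) 1 else 0

     // Current expression: every non-null value gets level 1.
     def currentLevels(inputValues: Seq[String]): Seq[Int] =
       inputValues.map(v => if (v == null) 0 else 1)

     // Proposed expression: every non-null value gets maxDef.
     def proposedLevels(inputValues: Seq[String]): Seq[Int] = {
       val maxDef = maxDefFor(inputValues)
       inputValues.map(v => if (v == null) 0 else maxDef)
     }

     // A definition level must lie between 0 and the column's max definition level.
     def isValid(levels: Seq[Int], maxDef: Int): Boolean =
       levels.forall(l => l >= 0 && l <= maxDef)
   }
   ```

   In this sketch, an input without nulls such as `Seq("x", "y")` has `maxDef == 0`, so the current expression produces levels of 1 that exceed it, while the proposed expression produces 0. When a null is present, both expressions give the same levels.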
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

