Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21320#discussion_r203934281

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala ---
    @@ -47,16 +47,25 @@ import org.apache.spark.sql.types._
      *
      * Due to this reason, we no longer rely on [[ReadContext]] to pass requested schema from [[init()]]
      * to [[prepareForRead()]], but use a private `var` for simplicity.
    + *
    + * @param parquetMrCompatibility support reading with parquet-mr or Spark's built-in Parquet reader
      */
    -private[parquet] class ParquetReadSupport(val convertTz: Option[TimeZone])
    +private[parquet] class ParquetReadSupport(val convertTz: Option[TimeZone],
    +    parquetMrCompatibility: Boolean)
       extends ReadSupport[UnsafeRow] with Logging {
       private var catalystRequestedSchema: StructType = _
    
    +  /**
    +   * Construct a [[ParquetReadSupport]] with [[convertTz]] set to [[None]] and
    +   * [[parquetMrCompatibility]] set to [[false]].
    +   *
    +   * We need a zero-arg constructor for SpecificParquetRecordReaderBase. But that is only
    +   * used in the vectorized reader, where we get the convertTz value directly, and the value here
    +   * is ignored. Further, we set [[parquetMrCompatibility]] to [[false]] as this constructor is only
    +   * called by the Spark reader.
    --- End diff --

    re: https://github.com/apache/spark/pull/21320/files/cb858f202e49d69f2044681e37f982dc10676296#r199631341

    Actually, it doesn't look clear to me either. What does the flag indicate? Do you mean the normal parquet-mr reader vs. the vectorized Parquet reader?
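For context on the pattern the doc comment describes: the class keeps a primary constructor carrying the real parameters, plus a zero-arg auxiliary constructor so that frameworks which instantiate it reflectively (as SpecificParquetRecordReaderBase does via the class name) can construct it; the defaulted values are then ignored on that path. A minimal, hypothetical sketch of this pattern (names like `ReadSupportSketch` are illustrative, not Spark's actual types):

```scala
// Simplified sketch of a two-constructor class: the primary constructor takes
// the real parameters, and a zero-arg auxiliary constructor supplies defaults
// for callers that must instantiate the class reflectively.
class ReadSupportSketch(val convertTz: Option[String],
                        val parquetMrCompatibility: Boolean) {
  // Zero-arg constructor for reflective instantiation; on that path the
  // caller obtains the real values elsewhere, so these defaults are ignored.
  def this() = this(None, false)
}

object ReadSupportSketch {
  def main(args: Array[String]): Unit = {
    // Reflective instantiation only works because the zero-arg constructor exists.
    val instance = classOf[ReadSupportSketch].getConstructor().newInstance()
    println(instance.convertTz)              // None
    println(instance.parquetMrCompatibility) // false
  }
}
```

This is why deleting the zero-arg constructor would break the reflective path even though the vectorized reader never reads the defaulted fields.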