Github user viirya commented on the issue: https://github.com/apache/spark/pull/21952

@dbtsai I didn't use Spark 2.3 when testing databricks-avro; I also ran it on the current master. However, a recent change to schema verification (`FileFormat.supportDataType`) causes an incompatibility, so I manually skipped the call to `supportDataType`. In other words, both built-in Avro and databricks-avro were tested on the current master. I suspect the differences between Spark 2.3 and the current master may explain the discrepancy.

Btw, for the following benchmark numbers I reduced the array feature length from 16000 to 1600.

```scala
> "com.databricks.spark.avro"

scala> spark.sparkContext.parallelize(writeTimes.slice(50, 150)).toDF("writeTimes").describe("writeTimes").show()
+-------+--------------------+
|summary|          writeTimes|
+-------+--------------------+
|  count|                 100|
|   mean|             0.21102|
| stddev|0.010737435692590912|
|    min|               0.195|
|    max|               0.247|
+-------+--------------------+

scala> spark.sparkContext.parallelize(readTimes.slice(50, 150)).toDF("readTimes").describe("readTimes").show()
+-------+--------------------+
|summary|           readTimes|
+-------+--------------------+
|  count|                 100|
|   mean| 0.09441999999999999|
| stddev|0.016021563751722395|
|    min|                0.07|
|    max|               0.134|
+-------+--------------------+

> "avro"

scala> spark.sparkContext.parallelize(writeTimes.slice(50, 150)).toDF("writeTimes").describe("writeTimes").show()
+-------+--------------------+
|summary|          writeTimes|
+-------+--------------------+
|  count|                 100|
|   mean|             0.21445|
| stddev|0.008952596824329237|
|    min|               0.201|
|    max|                0.25|
+-------+--------------------+

scala> spark.sparkContext.parallelize(readTimes.slice(50, 150)).toDF("readTimes").describe("readTimes").show()
+-------+--------------------+
|summary|           readTimes|
+-------+--------------------+
|  count|                 100|
|   mean|             0.10792|
| stddev|0.015983375201386058|
|    min|                0.08|
|    max|                0.15|
+-------+--------------------+
```
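For context, a minimal spark-shell sketch of how `writeTimes` and `readTimes` could be collected; the DataFrame `df`, the output path, and the sample count of 150 (with the first 50 dropped as warm-up by `.slice(50, 150)`) are assumptions for illustration, not the exact harness used here:

```scala
// Hypothetical benchmark harness (spark-shell; `spark` and its implicits
// are already in scope there). `df` is an assumed pre-built DataFrame
// containing the array feature column described in this PR.
def time(body: => Unit): Double = {
  val start = System.nanoTime()
  body
  (System.nanoTime() - start) / 1e9 // elapsed seconds
}

val format = "avro" // or "com.databricks.spark.avro"
val path = "/tmp/avro-bench" // hypothetical output location

// Take 150 samples each; .slice(50, 150) above keeps the last 100,
// discarding the first 50 iterations as JIT/cache warm-up.
val writeTimes = (1 to 150).map { _ =>
  time { df.write.format(format).mode("overwrite").save(path) }
}
val readTimes = (1 to 150).map { _ =>
  time { spark.read.format(format).load(path).count() }
}
```

Note that `count()` is used only to force a full scan, so each read sample includes deserializing every row rather than just opening the files.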