razajafri commented on pull request #31284: URL: https://github.com/apache/spark/pull/31284#issuecomment-768517258
@revans2 I ran a manual test with two files of 1M records each, written with Spark 3.0.0. Each file was read back with Spark 3.0.0, Spark 3.1, and master with my fix. Each file was read 3 times, and I used `spark.time` to time the reads, which I know isn't the most rigorous method, but it still gives us a ballpark number.

File 1 contains rows of `[Decimal(18,0), Decimal(7,3), Decimal(7,7), Decimal(12,2)]`. Average read times:
- spark-3.0: 3960 ms
- spark-3.1: 4262 ms
- spark-master-with-fix: 4129 ms

File 2 contains rows of `[Decimal(12,2)]`. Average read times:
- spark-3.0: 683 ms
- spark-3.1: 668 ms
- spark-master-with-fix: 638 ms

I don't know if/how we can automate a unit test for this. Let me know what you think.
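For reference, here is a minimal sketch of the kind of manual timing harness described above. The file path and object name are hypothetical (the actual test files aren't in the repo); `spark.time` simply prints the wall-clock time of the enclosed block, and the `foreach` forces a full materialization of every row so the decimal columns are actually decoded:

```scala
import org.apache.spark.sql.SparkSession

object DecimalReadTiming {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DecimalReadTiming")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical path to a Parquet file with decimal columns,
    // written beforehand with Spark 3.0.0.
    val path = "/tmp/decimal-test.parquet"

    // Read the file 3 times, as in the manual test above.
    // spark.time prints "Time taken: N ms" for each iteration.
    (1 to 3).foreach { _ =>
      spark.time {
        // foreach forces all columns to be read and decoded,
        // unlike count(), which can be answered from Parquet metadata.
        spark.read.parquet(path).foreach(_ => ())
      }
    }

    spark.stop()
  }
}
```

Averaging the three printed times gives numbers comparable to those above, though JVM warm-up and OS page caching mean the first iteration is usually slower.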