steveloughran commented on PR #3452: URL: https://github.com/apache/parquet-java/pull/3452#issuecomment-4299734733
Latest numbers before adding the UUID read (but with that long column):

```
Benchmark                                           (tableType)  Mode  Cnt     Score     Error  Units
VariantProjectionBenchmark.readAllRecords            Unshredded    ss   10  1645.855 ±  27.618  ms/op
VariantProjectionBenchmark.readAllRecords              Shredded    ss   10  2381.192 ±  41.940  ms/op
VariantProjectionBenchmark.readProjectedFileSchema   Unshredded    ss   10   932.050 ±  37.143  ms/op
VariantProjectionBenchmark.readProjectedFileSchema     Shredded    ss   10  1596.800 ±  50.421  ms/op
VariantProjectionBenchmark.readProjectedLeanSchema   Unshredded    ss   10  1750.998 ±  10.982  ms/op
VariantProjectionBenchmark.readProjectedLeanSchema     Shredded    ss   10   724.603 ±  18.377  ms/op
```

After adding the UUID variant field:

```
Benchmark                                           (tableType)  Mode  Cnt     Score     Error  Units
VariantProjectionBenchmark.readAllRecords            Unshredded    ss   10  1913.765 ±  18.896  ms/op
VariantProjectionBenchmark.readAllRecords              Shredded    ss   10  2679.631 ± 150.978  ms/op
VariantProjectionBenchmark.readProjectedFileSchema   Unshredded    ss   10   910.009 ±  33.074  ms/op
VariantProjectionBenchmark.readProjectedFileSchema     Shredded    ss   10  1616.585 ±  57.818  ms/op
VariantProjectionBenchmark.readProjectedLeanSchema   Unshredded    ss   10  1777.288 ±  14.049  ms/op
VariantProjectionBenchmark.readProjectedLeanSchema     Shredded    ss   10   723.679 ±   8.473  ms/op
```

Points to note:

* Full record read is now roughly 15% slower just from adding a UUID: about 16% on unshredded files (1646 to 1914 ms/op) and about 13% on shredded files (2381 to 2680 ms/op). Surprisingly expensive.
* Reading all records on a shredded file is still ~40% slower than on an unshredded one (2680 vs 1914 ms/op).
* Same odd behaviours on a projected schema; those timings are largely unchanged from the previous run.
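The percentages in the notes above can be re-derived from the `readAllRecords` scores. The following is a minimal sketch of that arithmetic only; the class name `VariantBenchmarkDeltas` and the helper `percentSlower` are hypothetical names made up for illustration and are not part of the PR or of parquet-java. It compares mean scores and ignores the reported error margins.

```java
import java.util.Locale;

// Hypothetical helper, not part of parquet-java: re-derives the relative
// slowdowns from the mean readAllRecords scores (ms/op) quoted in the comment.
class VariantBenchmarkDeltas {

  /** Returns how much slower `after` is than `before`, as a percentage of `before`. */
  static double percentSlower(double before, double after) {
    return (after - before) / before * 100.0;
  }

  public static void main(String[] args) {
    // Mean scores copied from the two benchmark runs.
    double unshreddedBefore = 1645.855;
    double unshreddedAfter = 1913.765;
    double shreddedBefore = 2381.192;
    double shreddedAfter = 2679.631;

    // Cost of adding the UUID field, per table type.
    System.out.printf(Locale.ROOT, "unshredded, UUID added: %.1f%% slower%n",
        percentSlower(unshreddedBefore, unshreddedAfter));
    System.out.printf(Locale.ROOT, "shredded, UUID added: %.1f%% slower%n",
        percentSlower(shreddedBefore, shreddedAfter));

    // Shredded versus unshredded full read, after the UUID field was added.
    System.out.printf(Locale.ROOT, "shredded vs unshredded: %.1f%% slower%n",
        percentSlower(unshreddedAfter, shreddedAfter));
  }
}
```

Because the scores are single-shot (`ss`) means over 10 runs, differences smaller than the quoted error are best treated as noise rather than as real regressions.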
