pitrou commented on PR #37032: URL: https://github.com/apache/arrow/pull/37032#issuecomment-1669922511
I pushed a variation so that Table inputs are benchmarked with multiple chunk sizes. The results are interesting: the performance degradation with increasing numbers of chunks is much less apparent for Table inputs than for ChunkedArray inputs:

```
-----------------------------------------------------------------------------------------------
Benchmark                                    Time           CPU  Iterations UserCounters...
-----------------------------------------------------------------------------------------------
FieldPathGetFromWideArray               701692 ns     701592 ns         974 items_per_second=14.2533M/s num_columns=10k
FieldPathGetFromWideArrayData           402517 ns     402454 ns        1749 items_per_second=24.8476M/s num_columns=10k
FieldPathGetFromWideBatch               691269 ns     691164 ns         996 items_per_second=14.4683M/s num_columns=10k
FieldPathGetFromWideChunkedArray/2    34847633 ns   34822898 ns          17 items_per_second=287.167k/s num_chunks=50 num_columns=10k
FieldPathGetFromWideChunkedArray/10    5530621 ns    5525915 ns         102 items_per_second=1.80966M/s num_chunks=10 num_columns=10k
FieldPathGetFromWideChunkedArray/25    2403874 ns    2403105 ns         285 items_per_second=4.16128M/s num_chunks=4 num_columns=10k
FieldPathGetFromWideChunkedArray/100   1318747 ns    1318526 ns         529 items_per_second=7.58423M/s num_chunks=1 num_columns=10k
FieldPathGetFromWideTable/2             511357 ns     511244 ns        1347 items_per_second=19.5601M/s num_chunks=50 num_columns=10k
FieldPathGetFromWideTable/10            390110 ns     390025 ns        1791 items_per_second=25.6394M/s num_chunks=10 num_columns=10k
FieldPathGetFromWideTable/25            387858 ns     387783 ns        1802 items_per_second=25.7876M/s num_chunks=4 num_columns=10k
FieldPathGetFromWideTable/100           387341 ns     387250 ns        1807 items_per_second=25.8231M/s num_chunks=1 num_columns=10k
```
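To put a number on "much less apparent", here is a quick sketch that compares the 50-chunk time against the 1-chunk time for each input kind. The times are copied from the `Time` column of the benchmark output in this comment; the `slowdown` helper and the dictionary names are only illustrative and are not part of the PR.

```python
# Wall-clock times in ns, copied from the "Time" column of the benchmark
# output in this comment and keyed by the reported num_chunks counter.
chunked_array_ns = {50: 34847633, 10: 5530621, 4: 2403874, 1: 1318747}
table_ns = {50: 511357, 10: 390110, 4: 387858, 1: 387341}


def slowdown(times_ns):
    """Ratio of the most-chunked time to the single-chunk time (illustrative helper)."""
    return times_ns[max(times_ns)] / times_ns[1]


for name, times in [("ChunkedArray", chunked_array_ns), ("Table", table_ns)]:
    print(f"{name}: {slowdown(times):.2f}x slower at 50 chunks than at 1 chunk")
```

With these numbers the ChunkedArray case is roughly 26x slower at 50 chunks than at 1 chunk, while the Table case is only about 1.3x slower.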
