aokolnychyi commented on pull request #2945:
URL: https://github.com/apache/iceberg/pull/2945#issuecomment-894524869


   Here are some benchmark numbers for writing 2.5 million records (flat 
schema, 7 columns). I am using bucketing with 32 buckets on an int column for 
partitioned writes.
   
   ```
   Benchmark                                                                       Mode  Cnt   Score   Error  Units
   TaskWriterParquetBenchmark.writePartitionedDataNewFanoutWriter                    ss    5  10.432 ± 0.382   s/op
   TaskWriterParquetBenchmark.writePartitionedDataOldFanoutWriter                    ss    5  11.315 ± 0.345   s/op
   TaskWriterParquetBenchmark.writePartitionedDataNewWriter                          ss    5  11.416 ± 0.994   s/op
   TaskWriterParquetBenchmark.writePartitionedDataOldWriter                          ss    5  11.331 ± 0.238   s/op
   TaskWriterParquetBenchmark.writePartitionedEqualityDeleteNewWriter                ss    5  11.795 ± 1.553   s/op
   TaskWriterParquetBenchmark.writeUnpartitionedDataNewWriter                        ss    5  10.736 ± 1.058   s/op
   TaskWriterParquetBenchmark.writeUnpartitionedDataOldWriter                        ss    5  10.501 ± 2.084   s/op
   TaskWriterParquetBenchmark.writeUnpartitionedEqualityDeleteNewWriter              ss    5   9.935 ± 0.166   s/op
   TaskWriterParquetBenchmark.writeUnpartitionedPositionDeleteWithoutRowNewWriter    ss    5   8.833 ± 0.791   s/op
   ```
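   To make the timings easier to compare, here is a small sketch (not part of the benchmark itself) that converts a few of the scores into records per second and checks whether the `Score ± Error` ranges of two runs overlap. The helper names `throughput` and `intervals_overlap` are made up for illustration, and range overlap is only a rough heuristic, not a significance test.

```python
RECORDS = 2_500_000  # records written per benchmark invocation (from the comment)

# (score, error) in s/op, copied from the benchmark output in this comment
RESULTS = {
    "writePartitionedDataNewFanoutWriter": (10.432, 0.382),
    "writePartitionedDataOldFanoutWriter": (11.315, 0.345),
    "writePartitionedDataNewWriter": (11.416, 0.994),
    "writePartitionedDataOldWriter": (11.331, 0.238),
    "writeUnpartitionedDataNewWriter": (10.736, 1.058),
    "writeUnpartitionedDataOldWriter": (10.501, 2.084),
}


def throughput(name):
    """Records written per second for one benchmark (hypothetical helper)."""
    score, _ = RESULTS[name]
    return RECORDS / score


def intervals_overlap(a, b):
    """True if [score - error, score + error] of the two benchmarks intersect."""
    (score_a, err_a), (score_b, err_b) = RESULTS[a], RESULTS[b]
    return abs(score_a - score_b) <= err_a + err_b


if __name__ == "__main__":
    pairs = [
        ("writePartitionedDataNewFanoutWriter", "writePartitionedDataOldFanoutWriter"),
        ("writePartitionedDataNewWriter", "writePartitionedDataOldWriter"),
        ("writeUnpartitionedDataNewWriter", "writeUnpartitionedDataOldWriter"),
    ]
    for new, old in pairs:
        print(
            f"{new}: {throughput(new):,.0f} rec/s, "
            f"{old}: {throughput(old):,.0f} rec/s, "
            f"ranges overlap: {intervals_overlap(new, old)}"
        )
```

   Of the three old/new pairs checked here, only the fanout writers have ranges that do not overlap; for the other two the difference is within the reported error.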
   
   Memory-wise, the old and new writers are very similar. Here is an example for partitioned data writes.
   
   ```
   TaskWriterParquetBenchmark.writePartitionedDataNewWriter:·gc.alloc.rate             ss    5  177.302 ± 17.914  MB/sec
   TaskWriterParquetBenchmark.writePartitionedDataNewWriter:·gc.churn.G1_Eden_Space    ss    5  136.865 ± 12.818  MB/sec
   TaskWriterParquetBenchmark.writePartitionedDataNewWriter:·gc.churn.G1_Old_Gen       ss    5    5.411 ±  0.646  MB/sec
   TaskWriterParquetBenchmark.writePartitionedDataOldWriter:·gc.alloc.rate             ss    5  177.730 ± 11.985  MB/sec
   TaskWriterParquetBenchmark.writePartitionedDataOldWriter:·gc.churn.G1_Eden_Space    ss    5  137.768 ± 21.407  MB/sec
   TaskWriterParquetBenchmark.writePartitionedDataOldWriter:·gc.churn.G1_Old_Gen       ss    5    5.420 ±  0.892  MB/sec
   ```
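   As a quick sanity check on "very similar", this sketch computes the relative difference between the new and old writer for each GC metric reported in this comment. The function name `relative_diff` is hypothetical; the numbers are the reported scores in MB/sec.

```python
# metric -> (new writer score, old writer score) in MB/sec, from the GC profiler output
GC_METRICS = {
    "gc.alloc.rate": (177.302, 177.730),
    "gc.churn.G1_Eden_Space": (136.865, 137.768),
    "gc.churn.G1_Old_Gen": (5.411, 5.420),
}


def relative_diff(new, old):
    """Absolute difference between the two scores as a fraction of the old score."""
    return abs(new - old) / old


if __name__ == "__main__":
    for metric, (new, old) in GC_METRICS.items():
        print(f"{metric}: {relative_diff(new, old):.2%} difference")
```

   For all three metrics the difference is below 1%, which is well inside the reported error.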


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


