lidavidm opened a new pull request #7387:
URL: https://github.com/apache/arrow/pull/7387


   As discussed in #7299, this makes IpcPayload and related functions public, 
and adds `arrow::ipc::GetPayloadSize`. It also adds a synthetic benchmark 
comparing it to `GetRecordBatchSize`. (Obviously, a very unfair comparison.)
   
   ```
   ---------------------------------------------------------------------------------------
   Benchmark                                  Time           CPU Iterations
   ---------------------------------------------------------------------------------------
   GetIpcPayloadSize/1/real_time             18 ns         18 ns   39779690   54.6317TB/s
   GetIpcPayloadSize/4/real_time             18 ns         18 ns   39345745   54.2457TB/s
   GetIpcPayloadSize/16/real_time            18 ns         18 ns   39393839   54.2546TB/s
   GetIpcPayloadSize/64/real_time            19 ns         19 ns   38973515   51.9554TB/s
   GetIpcPayloadSize/256/real_time           18 ns         18 ns   37427051   53.213TB/s
   GetIpcPayloadSize/1024/real_time          18 ns         18 ns   38466296   55.367TB/s
   GetIpcPayloadSize/4096/real_time          19 ns         19 ns   38322691   62.537TB/s
   GetIpcPayloadSize/8192/real_time          19 ns         19 ns   38556584   72.0611TB/s
   GetRecordBatchSize/1/real_time         10308 ns      10298 ns      67088   96.227GB/s
   GetRecordBatchSize/4/real_time         15564 ns      15547 ns      45002   63.744GB/s
   GetRecordBatchSize/16/real_time        30790 ns      30759 ns      22630   32.2385GB/s
   GetRecordBatchSize/64/real_time        87783 ns      87687 ns       8037   11.3322GB/s
   GetRecordBatchSize/256/real_time      307918 ns     307583 ns       2292   3.25852GB/s
   GetRecordBatchSize/1024/real_time    1215936 ns    1214092 ns        582   873.888MB/s
   GetRecordBatchSize/4096/real_time    5498273 ns    5486982 ns        126   221.522MB/s
   GetRecordBatchSize/8192/real_time   12711545 ns   12680450 ns         55   112.191MB/s
   ```
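
   The flat ~18 ns for `GetIpcPayloadSize` is what you would expect if the size query only reads lengths that were already computed when the payload was assembled. Below is a minimal sketch of that idea. `SketchPayload`, `AssemblePayload`, `SketchPayloadSize`, and `PadTo` are hypothetical names for illustration, not the types or functions added by this PR, and the 8-byte prefix and 8-byte alignment are assumptions about the stream framing.

   ```cpp
   #include <cstdint>
   #include <utility>
   #include <vector>

   // Hypothetical stand-in for an assembled IPC payload (not Arrow's struct).
   struct SketchPayload {
     int64_t metadata_length;            // bytes of serialized message metadata
     std::vector<int64_t> buffer_sizes;  // unpadded sizes of the body buffers
     int64_t body_length;                // padded body bytes, cached at assembly
   };

   // Round n up to the next multiple of alignment (alignment must be > 0).
   inline int64_t PadTo(int64_t n, int64_t alignment) {
     return (n + alignment - 1) / alignment * alignment;
   }

   // Assembly is the only step that visits every buffer.
   inline SketchPayload AssemblePayload(int64_t metadata_length,
                                        std::vector<int64_t> buffer_sizes) {
     int64_t body_length = 0;
     for (int64_t size : buffer_sizes) {
       body_length += PadTo(size, 8);  // assumed: each buffer padded to 8 bytes
     }
     return SketchPayload{metadata_length, std::move(buffer_sizes), body_length};
   }

   // The size query does constant work, independent of the number of buffers.
   // Assumed framing: a fixed-size prefix, then metadata padded to `alignment`.
   inline int64_t SketchPayloadSize(const SketchPayload& payload,
                                    int64_t prefix = 8, int64_t alignment = 8) {
     return PadTo(prefix + payload.metadata_length, alignment) +
            payload.body_length;
   }
   ```

   For example, `SketchPayloadSize(AssemblePayload(100, {10, 16, 3}))` pads the body to 16 + 16 + 8 = 40 bytes and the prefixed metadata to 112 bytes, giving 152.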
   
   This _doesn't_ meet the goal of ARROW-8487, which was to allow clients to 
limit the size of record batches sent via Flight. I tried adding 
`RecordBatchWriter::WritePayload`, but this needs some thought; in particular, 
when you have dictionaries, you need to assemble the dictionary payloads 
yourself, but can't do so because you can't get the writer's internal 
DictionaryMemo.
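
   For context, the client-side logic ARROW-8487 is after could look roughly like the sketch below: split a batch into row ranges until each range's estimated payload size fits under a limit. `SplitBySize` and `SizeFn` are hypothetical names, and the size callback is an assumed stand-in for whatever size query ends up being exposed. The sketch ignores dictionary payloads entirely, which is exactly the part that still needs thought.

   ```cpp
   #include <cstdint>
   #include <functional>
   #include <utility>
   #include <vector>

   // Hypothetical callback: estimated encoded size in bytes of the rows
   // [offset, offset + length) of some batch.
   using SizeFn = std::function<int64_t(int64_t offset, int64_t length)>;

   // Split [0, num_rows) into (offset, length) ranges whose estimated size is
   // at most `limit`, halving any range that is too large. A single row that
   // exceeds the limit is emitted as-is, since it cannot be split further.
   inline std::vector<std::pair<int64_t, int64_t>> SplitBySize(
       int64_t num_rows, int64_t limit, const SizeFn& size_of) {
     std::vector<std::pair<int64_t, int64_t>> result;
     std::vector<std::pair<int64_t, int64_t>> pending;
     if (num_rows > 0) pending.emplace_back(0, num_rows);
     while (!pending.empty()) {
       const int64_t offset = pending.back().first;
       const int64_t length = pending.back().second;
       pending.pop_back();
       if (length <= 1 || size_of(offset, length) <= limit) {
         result.emplace_back(offset, length);
         continue;
       }
       const int64_t half = length / 2;
       // Push the right half first so the left half is handled first and the
       // output stays in row order.
       pending.emplace_back(offset + half, length - half);
       pending.emplace_back(offset, half);
     }
     return result;
   }
   ```

   Halving keeps the number of size queries small, at the cost of producing ranges that may be well under the limit.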



