Weston Pace created ARROW-16138:
-----------------------------------

             Summary: [C++] Improve performance of ExecuteScalarExpression
                 Key: ARROW-16138
                 URL: https://issues.apache.org/jira/browse/ARROW-16138
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace


One of the things we want to be able to do in the streaming execution engine is 
process data in small batches sized to fit in the L2 cache.  Based on the 
literature, we would likely use batches somewhere in the range of 1k to 16k 
rows.  In ARROW-16014 we created a benchmark to measure the performance of 
ExecuteScalarExpression as the batch size gets smaller.  We observed two things:

 * Something is causing thread contention.  We should be able to get close to 
perfect linear speedup across threads when we are evaluating scalar expressions 
and each batch fits entirely into L2.  We are not seeing that.
 * The overhead of ExecuteScalarExpression is too high when processing small 
batches.  Even when the expression is doing real work (e.g. copies, 
comparisons), the execution time starts to be dominated by overhead once 
batches are down to about 10k rows.
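
To make the two quantities above concrete, here is a minimal sketch of a linear 
cost model.  It does not use the Arrow API, and it is not the ARROW-16014 
benchmark.  The function names, the cache size, the row width, and the timing 
values are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Number of rows that fit in a cache of `cache_bytes`, assuming every row
// occupies `bytes_per_row` summed over all columns (hypothetical fixed-width
// data; validity bitmaps and other buffers are ignored).
int64_t RowsFittingInCache(int64_t cache_bytes, int64_t bytes_per_row) {
  assert(cache_bytes > 0 && bytes_per_row > 0);
  return cache_bytes / bytes_per_row;
}

// Fraction of a single call's time that is fixed per-call overhead, under an
// assumed linear model: time = overhead_ns + per_row_ns * rows.
double OverheadFraction(double overhead_ns, double per_row_ns, int64_t rows) {
  assert(overhead_ns >= 0 && per_row_ns > 0 && rows > 0);
  const double work_ns = per_row_ns * static_cast<double>(rows);
  return overhead_ns / (overhead_ns + work_ns);
}
```

Under these assumptions, a 1 MiB L2 cache and 64 bytes per row give 
RowsFittingInCache(1 << 20, 64) == 16384 rows.  With an assumed 2500 ns of fixed 
overhead per call and 0.25 ns of work per row, OverheadFraction(2500.0, 0.25, 
10000) is 0.5, and shrinking the batch to 1000 rows pushes it above 0.9.  These 
inputs are illustrative, not measurements.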



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
