mapleFU commented on issue #38370: URL: https://github.com/apache/arrow/issues/38370#issuecomment-1774408763
Calling a compute function per scalar is like the volcano model in a database: every call pays the cost of

1. finding the function (dispatch),
2. detecting the input type,
3. computing — the only logic we actually need, and
4. wrapping the output.

Pure C++ code is a bit like `codegen` in such a system: you already know the types (though reading from a file might still give non-optimal performance), so computing with raw C++ on self-defined types is faster. You can achieve similar performance by using templates to compute the logic directly. So I don't think calling compute per scalar is a good approach when you can ensure the function call and already know the input/output types.

Also, when I ran the benchmark locally, most of the extra time went to:

1. setting up the framework, and
2. dispatching the function.

So you may want to benchmark just the "compute time" rather than the whole call — the initialization of `arrow::compute` might take some time.

> by the way, is this code generated by chatgpt?

No.
